Abstract

Metric data structures (distance oracles, distance labeling schemes, routing schemes) and low-distortion
embeddings provide a powerful algorithmic methodology, which has been successfully applied for
approximation algorithms [LLR95], online algorithms [BBMN11], distributed algorithms [KKM+12] and
for computing sparsifiers [ST04]. However, this methodology appears to have a limitation: the worst-case
performance inherently depends on the cardinality of the metric, and one could not specify in
advance which vertices/points should enjoy a better service (i.e., stretch/distortion, label size/dimension)
than that given by the worst-case guarantee.

In this paper we alleviate this limitation by devising a suite of prioritized metric data structures and
embeddings. We show that given a priority ranking $(x_1, x_2, \ldots, x_n)$ of the graph vertices (respectively,
metric points) one can devise a metric data structure (respectively, embedding) in which the stretch
(resp., distortion) incurred by any pair containing a vertex $x_j$ will depend on the rank $j$ of the vertex.
We also show that other important parameters, such as the label size and (in some sense) the dimension,
may depend only on $j$. In some of our metric data structures (resp., embeddings) we achieve both
prioritized stretch (resp., distortion) and label size (resp., dimension) simultaneously. The worst-case
performance of our metric data structures and embeddings is typically asymptotically no worse than that of
their non-prioritized counterparts.


1 Introduction 


The celebrated distance oracle of Thorup and Zwick [TZ05] enables one to preprocess an undirected weighted
$n$-vertex graph $G = (V, E)$ and produce a data structure (aka distance oracle) of size $O(t \cdot n^{1+1/t})$
(for a parameter $t = 1, 2, \ldots$) that supports distance queries between pairs $u, v \in V$ in time $O(t)$ per query.
(The query time was recently improved to $O(1)$ by [Che14, Wul13].) The distance estimates provided by
the oracle are within a factor of $2t - 1$ from the actual distance $d_G(u, v)$ between $u$ and $v$ in $G$. The
approximation factor ($2t - 1$ in this case) is called the stretch. Distance oracles can serve as an example of
a metric data structure; other very well-studied examples include distance labeling [Pel99, GPPR01] and
routing [TZ01a, AP92]. Thorup-Zwick's oracle can also be converted into a distance-labeling scheme: each
vertex is assigned a label of size $O(n^{1/t} \cdot \log^{1-1/t} n)$ so that given labels of $u$ and $v$ the query algorithm
can provide a $(2t - 1)$-approximation of $d_G(u, v)$. Moreover, the oracle also gives rise to a routing scheme
[TZ01a] that exhibits a similar tradeoff.

A different but closely related thread of research concerns low-distortion embeddings. A celebrated
theorem of Bourgain [Bou86] asserts that any $n$-point metric $(X, d)$ can be embedded into an
$O(\log n)$-dimensional Euclidean space with distortion $O(\log n)$. (Roughly speaking, distortion and stretch are the
same thing. See Section 2 for formal definitions.) Fakcharoenphol et al. [FRT04] (following Bartal [Bar96,
Bar98]) showed that any metric $(X, d)$ embeds into a distribution over trees (in fact, ultrametrics) with
expected distortion $O(\log n)$.

These (and many other) important results are not only appealing from a mathematical perspective, but
have also proved extremely useful for numerous applications in Theoretical Computer Science and
beyond [LLR95, BBMN11, KKM+12, ST04]. A natural disadvantage is the dependence of all the relevant
parameters on $n$, the cardinality of the input graph/metric. However, all these results are either completely
tight, or very close to being completely tight. In order to address this issue, metric data structures and
embeddings in which some pairs of vertices/points enjoy better stretch/distortion or with improved label
size/dimension were developed. Specifically, [KSW09, ABC+05, ABN11, CDG06] studied embeddings
and distance oracles in which the distortion/stretch of at least a $1 - \epsilon$ fraction of the pairs is improved as a
function of $\epsilon$, either for a fixed $\epsilon$ or for all $\epsilon \in [0, 1]$ simultaneously (e.g. for a fixed $\epsilon$, embeddings into
Euclidean space of dimension $O(\log(1/\epsilon))$ with distortion $O(\log(1/\epsilon))$, or a distance oracle with stretch
$2\lceil t \cdot \frac{\log(2/\epsilon)}{\log n} \rceil + 1$ for a $1 - \epsilon$ fraction of the pairs). Also, [ABN07, SS09, AC14] devised embeddings and
distance oracles that provide distortion/stretch $O(\log k)$ for all pairs $(x, y)$ of points such that $y$ is among the
$k$ closest points to $x$, and distance labeling schemes that support queries only between $k$-nearest neighbors,
in which the label size depends only on $k$ rather than $n$.

An inherent shortcoming of these results is, however, that the pairs that enjoy better than worst-case
distortion cannot be specified in advance. In this paper we alleviate this shortcoming and devise a suite
of prioritized metric data structures and low-distortion embeddings. Specifically, we show that one can
order the graph vertices $V = (x_1, \ldots, x_n)$ arbitrarily in advance, and devise metric data structures (i.e.,
oracles/labelings/routing schemes) that, for a parameter $t = 1, 2, \ldots$, provide stretch $2\lceil t \cdot \frac{\log j}{\log n} \rceil - 1$ (instead
of $2t - 1$) for all pairs involving $x_j$, while using the same space as corresponding non-prioritized data
structures! In some cases the label size can be simultaneously improved for the high priority points, as
described in the sequel.

The same phenomenon occurs for low-distortion embeddings. We devise an embedding of general metrics
into an $O(\log n)$-dimensional Euclidean space that provides prioritized distortion $O(\log j \cdot (\log\log j)^{1/2+\epsilon})$,
for any constant $\epsilon > 0$ (i.e., the distortion for all pairs containing $x_j$ is $O(\log j \cdot (\log\log j)^{1/2+\epsilon})$). Similarly,
our embedding into a distribution of trees provides prioritized expected distortion $O(\log j)$.
We introduce a novel notion of improved dimension for high priority points. In general we cannot
expect that the dimension of a Euclidean embedding with low distortion (even prioritized) will be small
(as Euclidean embedding into dimension $D$ has worst-case distortion of $\Omega(n^{1/D} \cdot \log n)$ for some metrics
[ABN11]). What we can offer is an embedding in which the high ranked points have only a few "active"
coordinates. That is, only the first $O(\mathrm{poly}(\log j))$ coordinates in the image of $x_j$ will be nonzero, while the
distortion is also bounded by $O(\mathrm{poly}(\log j))$. This could be useful in a setting where the high ranked points
participate in numerous computations: since representing these points requires very few coordinates,
we can store many of them in the cache or other high speed memory. We remark that our framework is the
first which allows simultaneously improved distortion and dimension (or improved stretch and label size)
for the high priority points, while providing some guarantee for all pairs.

We have a construction of prioritized distance oracles that exhibits a qualitatively different behavior
than that of our aforementioned oracles. Specifically, we devise a distance oracle with space $O(n \log\log n)$
(respectively, $O(n \log^* n)$) and prioritized stretch $O(\frac{\log n}{\log(n/j)})$ (respectively, $2^{O(\frac{\log n}{\log(n/j)})}$). Observe that as
long as $j < n^{1-\epsilon}$ for any fixed $\epsilon > 0$, the prioritized stretch of both these oracles is $O(1)$. The query
time is $O(1)$. These oracles are, however, not path-reporting (a path-reporting oracle can return an actual
approximate shortest path in the graph, in time proportional to its length). We also devise a path-reporting
prioritized oracle, which was mentioned above: it has space $O(t \cdot n^{1+1/t})$, stretch $2\lceil t \cdot \frac{\log j}{\log n} \rceil - 1$, and
the query time¹ is $O(\lceil t \cdot \frac{\log j}{\log n} \rceil)$. In the full version of this paper we also devise a path-reporting prioritized
distance oracle (extending [EP15]) with space $O(n \log\log n)$, stretch $O((\frac{\log n}{\log(n/j)})^{\log_{4/3} 7})$, and query time
$O(\log(\frac{\log n}{\log(n/j)}))$. (Observe that this stretch and query time are $O(1)$ for all $j < n^{1-\epsilon}$.)

This second oracle can be distributed as a labeling scheme, in which not only the stretch $2\lceil t \cdot \frac{\log j}{\log n} \rceil - 1$
is prioritized, but also the label size is smaller for high priority points: it is $O(n^{1/t} \cdot \log j)$ rather than the
non-prioritized $O(n^{1/t} \cdot \log n)$. In our routing scheme, if $j$ is the priority rank of the destination $x_j$, it has
prioritized stretch $4\lceil t \cdot \frac{\log j}{\log n} \rceil - 3$ (instead of $4t - 5$), the routing tables have size $O(n^{1/t} \cdot \log j)$ (instead of
$O(n^{1/t} \cdot \log n)$), and labels have size $O(\log j \cdot \lceil t \cdot \frac{\log j}{\log n} \rceil)$ (instead of $O(t \cdot \log n)$).

We also consider the dual setting in which the stretch is fixed, and the label size $\lambda(j)$ of $x_j$ is smaller when
$j \ll n$. The function $\lambda(j)$ will be called prioritized label size. Specifically, with prioritized label size
$O(j^{1/t} \cdot \log j)$ we can have stretch $2t - 1$. For certain points on the tradeoff curve we can even have both
stretch and label size prioritized simultaneously! In particular, a variant of our distance labeling scheme
provides a prioritized stretch $2\lceil \log j \rceil - 1$ and prioritized label size $O(\log j)$. For routing we have similar
guarantees independent of $n$. We also devise a distance labeling scheme for graphs that exclude a fixed
minor with stretch $1 + \epsilon$ and prioritized label size $O(1/\epsilon \cdot \log j)$ (extending [AG06, Tho01]).

Another notable result in this context is our prioritized embedding into a single tree. It is well-known
that any metric can be embedded into a single dominating tree with linear distortion, and that this is tight
[RR98]. We show that any $n$-point metric $(X, d)$ enjoys an embedding into a single dominating tree with
prioritized distortion $\alpha(j)$ if and only if the sum of reciprocals $\sum_{j=1}^{\infty} 1/\alpha(j)$ converges. In particular,
prioritized distortion $\alpha(j) = j \cdot \log j \cdot (\log\log j)^{1.01}$ is admissible, while $\alpha(j) = j \cdot \log j \cdot \log\log j$ is
not, i.e., both our upper and lower bounds are tight. This lower bound stands out as it shows that it is not
always possible to replace non-prioritized distortion of $\alpha(n)$ by a prioritized distortion $\alpha(j)$. For single-tree
embedding the non-prioritized distortion is linear, while the prioritized one is provably superlinear.

¹ We believe the query time can be improved to $O(1)$: [Che14] combines the oracles of [TZ05] and of [MN06] to obtain query
time $O(1)$. In the full version of our paper we show that the oracle of [MN06] can be altered to give prioritized stretch, similar to
that of [TZ05] we show here. Using the techniques of [Che14] should thus yield prioritized stretch with $O(1)$ query time.
1.1 Overview of Techniques

We elaborate briefly on the methods used to obtain our results. 

Distance Oracles, Distance Labeling and Routing. We have two basic techniques for obtaining distance 
oracles with prioritized stretch. The first one is manifested in Theorem 5, and the idea is as follows: Partition 
the vertices into sets according to their priority, and for each set $K \subseteq V$, apply as a black-box a known
distance oracle on $K$, while for the other vertices store the distance to their nearest neighbor in $K$. We show
that the stretch of pairs in $K \times V$ is only a factor of 2 worse than the one guaranteed for $K \times K$. Furthermore,
we exploit the fact that for sets K of small size, we can afford very small stretch and still maintain small 
space. The exact choice of the black-box oracle and of the partitions enables a range of tradeoffs between 
space and prioritized stretch. 
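
The first technique can be illustrated for a single set $K$ with a minimal sketch. It assumes the black-box oracle on $K$ is simply a table of exact $K \times K$ distances (a stand-in for whichever oracle Theorem 5 actually uses), and the names `build_pivot_oracle` and `query` are hypothetical. With exact inner distances the estimate for a pair in $K \times V$ is within a factor of 3; replacing the table by a black box of stretch $s$ would give $2s + 1$ by the same triangle-inequality argument.

```python
import heapq

def dijkstra(adj, source):
    """Exact shortest-path distances from source; adj maps u -> {v: weight}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def build_pivot_oracle(adj, K):
    """Store exact K x K distances and, for every vertex, its nearest point of K.
    Assumes a connected graph; the exact table stands in for a black-box oracle."""
    from_k = {k: dijkstra(adj, k) for k in K}
    inner = {(a, b): from_k[a][b] for a in K for b in K}
    pivot = {}
    for v in adj:
        p = min(K, key=lambda k: from_k[k][v])
        pivot[v] = (p, from_k[p][v])
    return inner, pivot

def query(oracle, x, v):
    """Estimate d(x, v) for x in K: walk from v to its pivot, then inside K."""
    inner, pivot = oracle
    p, d_vp = pivot[v]
    return d_vp + inner[(p, x)]
```

The estimate never underestimates, since it is the length of an actual walk from $v$ to $x$ through the pivot of $v$.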

Our second technique for an oracle with prioritized stretch, used in Theorem 6, is based on a non-black-box
variation of the [TZ05] oracle. In their construction for stretch $2t - 1$, a (non-increasing) sequence of
$t - 1$ sets is generated by repeated random sampling. We show that if a vertex is chosen $i$ times, then the
query algorithm can be changed to improve the stretch from $2t - 1$ to $2(t - i) - 1$, for any pair containing
such a vertex. This observation only shows that there exists a priority ranking for which the oracle has the
required prioritized stretch. In order to handle any given ranking, we alter the construction by forcing high
ranked elements to be chosen numerous times, and show that this increases the space usage by at most a
factor of 2.

In order to build a distance labeling scheme out of their distance oracle, [TZ05] pay an additional factor
of $O(\log^{1-1/t} n)$ in the label size (which essentially comes from applying concentration bounds). Attempting
to circumvent this logarithmic dependence on $n$, in Theorem 7 we give a different bound on the deviation
probability that depends on the priority ranking of the point. Thus the increase in the label size for the $j$-th
point in the ranking is only $O(\log j)$. To obtain arbitrary fixed stretch $2t - 1$ for all pairs, in Theorem 8 we
combine this scheme with an iterative application of a source restricted distance labeling of [RTZ05].

Most results on distance labeling for bounded treewidth graphs, planar graphs, and graphs excluding a 
fixed minor, are based on recursively partitioning the graph into small pieces using small separators (as in 
[LT79]). The label of a vertex essentially consists of the distances to (some of) the vertices in the separator. 
In order to obtain prioritized label size, such as those given in Theorem 10 and Theorem 11, high ranked 
vertices should participate in few iterations. To this end, we define multiple phases of applying separators, 
where each phase tries to separate only a certain subset of the vertices (starting with the highest ranked, and
finishing with the lowest). This way high ranked vertices will belong to a separator after a few levels, thus their
label will be short. 

Tree-routing of [Tho01] is based on categorizing tree vertices as either heavy or light, depending on the
size of their subtree. Our prioritized tree-routing assigns each vertex a weight which depends on its priority,
and a vertex is heavy if the sum of weights of its descendants is sufficiently large. This idea paves the way to
our prioritized routing scheme for general graphs as well.

Embeddings. It is folklore that a metric minimum spanning tree (henceforth, MST) achieves distortion
$n - 1$. For our prioritized embedding of general metrics $(X, d)$ into a single tree we consider a complete
graph $G = (X, \binom{X}{2})$ with a weight function that depends on the priority ranking. Specifically, edges incident
on high-priority points get higher weights. We then compute an MST in this (generally non-metric) graph,
and show that, given a certain convergence condition on the priority ranking, this MST provides a desired
prioritized single-tree embedding. Remarkably, we also show that when this condition is not met, no such
embedding is possible even for the metric induced by $C_n$. Hence this embedding is tight.
Our probabilistic embedding to trees with prioritized expected distortion in Theorem 4 is based on the
construction of [FRT04]. The method of [FRT04] involves sampling a random permutation and a random
radius, then using these to create a hierarchical partitioning of the metric from which a tree is built. We
make the observation that, in some sense, the expected distortion of a point depends on its position in the
permutation. Rather than choosing a permutation uniformly at random, we choose one which is strongly
correlated with the given priority ranking. One must be careful to allow sufficient randomness in the
permutation choice so that the analysis can still go through, while guaranteeing that high ranked points will appear
in the first positions of the permutation.

The embedding of Theorem 14 for arbitrary metrics $(X, d)$ into Euclidean space (or any $\ell_p$ space) with
prioritized distortion uses similar ideas. We partition the points into sets according to the priorities, and for every
set $K \subseteq X$ apply as a black-box the embedding of [Bou85]. We show that since the embedding has certain
properties, it can be extended in a Lipschitz manner to all of the metric, while having a distortion guarantee
for any pair in $K \times X$.

The result of Theorem 15, which gives prioritized distortion and dimension, is more technically involved. 
In order to ensure that high priority points are mapped to the zero vector in the embeddings tailored for the 
lower priority points, we change Bourgain’s embedding, which is defined as distances to randomly chosen 
sets. Roughly speaking, when creating the embedding for a set K, we add all the higher ranked points to the 
random sets. This means the original analysis does not work directly, and we turn to a subtle case analysis 
to bound the distortion; see Section 8.2 for more details. 

1.2 Organization 

After a few preliminary definitions, we show the single tree prioritized embedding in Section 3, and the 
probabilistic version in Section 4. In Section 5 we discuss our prioritized distance oracles, and in Section 6 
the prioritized labeling schemes. The prioritized routing is shown in Section 7. Finally, in Section 8 we 
present our prioritized embedding results into normed spaces. 

2 Preliminaries 

All the graphs $G = (V, E)$ we consider are undirected and weighted. Let $x_1, \ldots, x_n \in V$ be a priority
ranking of the vertices. Let $d_G$ be the shortest path metric on $G$, and let $\alpha, \beta : [n] \to \mathbb{R}_+$ be monotone
non-decreasing functions.

A distance oracle for a graph G is a succinct data structure, that can approximately report distances 
between vertices of G. The parameters of this data structure we will care about are its space, query time, 
and stretch factor. We always measure the space of the oracle as the number of words needed to store it 
(where each word is $O(\log n)$ bits). The oracle has prioritized stretch $\alpha(j)$, if for any $1 \le j < i \le n$, when
queried for $x_j, x_i$ the oracle reports a distance $\hat{d}(x_j, x_i)$ such that

\[ d_G(x_j, x_i) \le \hat{d}(x_j, x_i) \le \alpha(j) \cdot d_G(x_j, x_i) . \]

Some oracles can be distributed as a labeling scheme, where each vertex is given a short label, and the 
approximate distance between two vertices should be computed by inspecting their labels alone. We say 
that a labeling scheme has prioritized label size $\beta(j)$, if for every $j \in [n]$, the label of $x_j$ consists of at
most $\beta(j)$ words. See Section 7 for the precise settings of routing that we consider.
Figure 1: An illustration for the algorithm presented during the proof of Theorem 1. We are given a metric
space over $X = \{x_1, x_2, x_3, x_4\}$, with the function $\alpha(1) = 2, \alpha(2) = 4, \alpha(3) = 8, \alpha(4) = 16$. In the
first step we assign new weights over the edges, then find an MST in the new graph, and finally, restore
the original weights. For example the original distance between $v_2, v_3$ was 2, while in the returned tree the
distance is 7. Hence the pair $v_2, v_3$ suffers distortion $3.5 < 4$.


Let $(X, d_X)$ be a finite metric space, and let $x_1, \ldots, x_n$ be a priority ranking of the points in $X$. Given
a target metric $(Y, d_Y)$, and a non-contractive map $f : X \to Y$,² we say that $f$ has priority distortion $\alpha(j)$
if for all $1 \le j < i \le n$,

\[ d_Y(f(x_j), f(x_i)) \le \alpha(j) \cdot d_X(x_j, x_i) . \]

Similarly, if $f : X \to Y$ is non-expansive, then it has priority distortion $\alpha(j)$ if for all $1 \le j < i \le n$,
$d_Y(f(x_j), f(x_i)) \ge d_X(x_j, x_i)/\alpha(j)$. For probabilistic embedding, we require that each map in the support
of the distribution is non-contractive, and the prioritized bound on the distortion holds in expectation.
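
The definition for non-contractive maps can be checked mechanically on small instances. The following is a minimal sketch; the function name `priority_distortion_violations` is hypothetical, and `d_y` is assumed to already compose the target metric with the map $f$, so that both distance functions take points of $X$.

```python
def priority_distortion_violations(ranking, d_x, d_y, alpha):
    """Return the rank pairs (j, i), j < i, that break non-contraction
    or the alpha(j) upper bound. Ranks are 1-based, as in the text.

    ranking: points of X listed in priority order (x_1 first)
    d_x:     original metric on X
    d_y:     distance between the images, given the original points
    alpha:   function from rank j to the allowed distortion
    """
    bad = []
    for j in range(len(ranking)):
        for i in range(j + 1, len(ranking)):
            xj, xi = ranking[j], ranking[i]
            orig, img = d_x(xj, xi), d_y(xj, xi)
            if img < orig or img > alpha(j + 1) * orig:
                bad.append((j + 1, i + 1))
    return bad
```

An empty result means the map has priority distortion $\alpha(j)$ on the given instance; only the rank of the higher-priority endpoint of each pair enters the bound.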

In the special case that the target metric is a normed space $\ell_p$, we say that the embedding has prioritized
dimension $\beta(j)$, if for every $j \in [n]$, only the first $\beta(j)$ coordinates in $f(x_j)$ may be nonzero.

3 Single Tree Embedding with Prioritized Distortion 

In this section we show tight bounds on the priority distortion for an embedding into a single tree. The 
bounds are somewhat non-standard, as they are not attained for a single specific function, but rather for the 
following family of functions. Define $\Phi$ to be the family of functions $\alpha : \mathbb{N} \to \mathbb{R}_+$ that satisfy the following
properties:

• $\alpha$ is non-decreasing.

• $\sum_{i=1}^{\infty} 1/\alpha(i) \le 1$.
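
As a quick worked check of the two conditions (an illustration, not part of the original argument): the values used in Figure 1 follow $\alpha(j) = 2^j$, which belongs to the family, whereas any linear function does not because the harmonic series diverges.

```latex
\[
\alpha(j) = 2^{j}: \qquad
\sum_{i=1}^{\infty} \frac{1}{\alpha(i)} = \sum_{i=1}^{\infty} 2^{-i} = 1 \le 1,
\qquad \text{so } \alpha \in \Phi .
\]
\[
\alpha(j) = c \cdot j \ (c > 0): \qquad
\sum_{i=1}^{n} \frac{1}{c\, i} \ge \frac{\ln n}{c} \xrightarrow[n \to \infty]{} \infty,
\qquad \text{so } \alpha \notin \Phi .
\]
```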

3.1 Upper Bound 

Theorem 1. For any finite metric space $(X, d)$ and any $\alpha \in \Phi$, there is a (non-contractive) embedding of
$X$ into a single tree with priority distortion $2\alpha(j)$.

Proof. Let $x_1, \ldots, x_n$ be the priority ranking of $X$, and let $G = (X, E)$ be the complete graph on $X$. For
$e = \{u, v\} \in E$, let $\ell(e) = d(u, v)$. We also define the following (prioritized) weights $w : E \to \mathbb{R}$: for
any $1 \le j < i \le n$ the edge $e = \{x_j, x_i\}$ will be given the weight $w(e) = \alpha(j) \cdot \ell(e)$. Observe that the
$w$ weights on $G$ do not necessarily satisfy the triangle inequality. Let $T$ be the minimum spanning tree of
$(X, E, w)$ (this tree is formed by iteratively removing the heaviest edge from a cycle). Finally, return the
tree $T$ with the edges weighted by $\ell$. We claim that this tree has priority distortion $2\alpha(j)$.

² The map $f$ is non-contractive if for any $u, v \in X$, $d_X(u, v) \le d_Y(f(u), f(v))$.

Consider some $x_j, x_i \in X$ with $j < i$. If the edge $e = \{x_j, x_i\} \in E(T)$ then clearly this pair has distortion 1.
Otherwise, let $P$ be the unique path between $x_j$ and $x_i$ in $T$. Since $e$ is not in $T$, it is the heaviest edge on
the cycle $P \cup \{e\}$, and for any edge $e' \in P$ we have that $w(e') \le w(e) = \alpha(j) \cdot d(x_j, x_i)$. Consider some
$x_k \in X$, and note that there can be at most 2 edges touching $x_k$ in $P$. If $e' \in P$ is such an edge, and its
weight by $w$ was changed by a factor of $\alpha(k)$, then $\alpha(k) \cdot \ell(e') \le \alpha(j) \cdot d(x_j, x_i)$. Summing this over all
the possible values of $k$ we obtain that the length of $P$ is at most

\[ \sum_{e' \in P} \ell(e') \le 2 \sum_{k=1}^{n} \frac{\alpha(j)}{\alpha(k)} \cdot d(x_j, x_i) \le 2\alpha(j) \cdot d(x_j, x_i) . \tag{1} \]

□
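
The construction in the proof of Theorem 1 can be sketched in a few lines. This is an illustrative sketch only: it uses Kruskal's algorithm rather than the cycle-based description in the proof (any minimum spanning tree of the reweighted graph has the property the proof relies on), and the names `prioritized_tree` and `tree_distance` are hypothetical.

```python
from itertools import combinations

def prioritized_tree(points, d, alpha):
    """points: list in priority order (rank 1 first); d: metric on the points;
    alpha: function from rank to factor. Returns the tree as a list of
    (u, v, d(u, v)) edges, i.e. with the original lengths restored."""
    n = len(points)
    # Edge {x_j, x_i} with j < i gets the prioritized weight alpha(j) * d(x_j, x_i).
    edges = sorted(
        (alpha(j + 1) * d(points[j], points[i]), j, i)
        for j, i in combinations(range(n), 2)
    )
    parent = list(range(n))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = []
    for _, j, i in edges:  # Kruskal: scan edges by increasing prioritized weight
        rj, ri = find(j), find(i)
        if rj != ri:
            parent[rj] = ri
            tree.append((points[j], points[i], d(points[j], points[i])))
    return tree

def tree_distance(tree, u, v):
    """Length of the unique u-v path in the tree, found by depth-first search."""
    adj = {}
    for a, b, w in tree:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    stack, seen = [(u, 0.0)], {u}
    while stack:
        node, dist = stack.pop()
        if node == v:
            return dist
        for nxt, w in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, dist + w))
    raise ValueError("v is not reachable from u")
```

On a small instance one can compare `tree_distance` against `d` for every pair and confirm that the ratio for a pair containing $x_j$ stays within $2\alpha(j)$ whenever $\alpha \in \Phi$.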


Corollary 1. For any fixed $0 < \epsilon < 1/2$, one can take the function $\alpha : \mathbb{N} \to \mathbb{R}$ defined by $\alpha(1) = 1 + \epsilon$,
and for $j \ge 2$, $\alpha(j) = \frac{j (\log j)^{1+\epsilon}}{c}$, which lies in $\Phi$ for $c \approx \epsilon^2$, and obtain priority distortion $O(j (\log j)^{1+\epsilon})$.
Furthermore, the distortion of the pairs containing $x_1$ is only $1 + 3\epsilon$.

Proof. The fact that $\alpha \in \Phi$ follows by noting that $\int \frac{dx}{x (\ln x)^{1+\epsilon}} = -\frac{1}{\epsilon (\ln x)^{\epsilon}} + C$. To see the small distortion for pairs
$x_1, x_i$, observe that in the case $\{x_1, x_i\} \notin T$, the first edge of the path $P$ from $x_1$ to $x_i$ has length at most
$d(x_1, x_i)$, while none of the other edges on $P$ is touching $x_1$. Furthermore, since $1/\alpha(1) > 1 - \epsilon$, we have
that $\sum_{k=2}^{\infty} 1/\alpha(k) < \epsilon$, and so we can replace (1) by

\[ \sum_{e' \in P} \ell(e') \le d(x_1, x_i) + 2 \sum_{k=2}^{n} \frac{\alpha(1)}{\alpha(k)} \cdot d(x_1, x_i) \le (1 + 3\epsilon) \cdot d(x_1, x_i) . \]

□


3.2 Lower Bound 

Here we show a matching lower bound (up to a constant, which is only 2 for trees without Steiner nodes³) on
the possible functions admitting an embedding into a tree with priority distortion. We first show that a
(non-decreasing) function which is not in $\Phi$ cannot bound the priority distortion in a spanning tree embedding.
Then using an argument similar to that of [Gup01], we extend this for arbitrary dominating trees,⁴ while
losing a factor of 8 in the lower bound.

Theorem 2. For any non-decreasing function $\alpha : \mathbb{N} \to \mathbb{R}$ with $\alpha \notin \Phi$, there exists an integer $n$, a graph
$G = (V, E)$ with $|V| = n$ vertices, and a priority ranking of $V$, such that no spanning tree of $G$ has priority
distortion less than $\alpha$.

Proof. Since $\alpha \notin \Phi$, there exists an integer $n'$ such that $\sum_{i=1}^{n'} 1/\alpha(i) > 1$. Take some integer $n > n'$ such
that $\frac{n}{\alpha(i)+1}$ is an integer for all $1 \le i \le n'$ (assume w.l.o.g. that the $\alpha(i)$ are rational numbers). Then let
$G = C_n$, a cycle on $n$ points with unit weight on the edges. Clearly, a spanning tree of $C_n$ is obtained by
removing a single edge, thus we will choose the priorities $x_1, \ldots, x_n \in V$ in such a way that no edge can
be spared.

³ We say that the target tree has Steiner nodes if it contains more vertices than the original graph.

⁴ A tree $T$ dominates a graph $G$ if $d_T \ge d_G$.

Figure 2: An illustration for the proof of Theorem 2. As all the pairs containing $x_i$ cannot suffer distortion
greater than $\alpha(i)$, all the edges of distance at most $a_i$ from $x_i$ cannot be deleted from the tree. As $\sum_i 2a_i > n$,
placing $x_1, x_2, \ldots$ so that the relevant sets of edges are disjoint and cover all the edges, there is no edge that
can be deleted.

Seeking contradiction, assume that there exists a spanning tree with priority distortion less than $\alpha$. Let
$x_1$ be an arbitrary vertex, and note that if $u$ is a vertex within distance $a_1 = n/(\alpha(1) + 1)$ from $x_1$, then
all the edges on the shortest path from $x_1$ to $u$ must remain in the tree. Otherwise, the distortion of the pair
$\{x_1, u\}$ will be at least $\frac{n - a_1}{a_1} = \alpha(1)$. There are $\frac{2n}{\alpha(1)+1}$ such edges that must belong to the tree (since we
consider vertices from both sides of $x_1$). Now take $x_2$ to be a vertex at distance $\frac{n}{\alpha(1)+1} + \frac{n}{\alpha(2)+1}$ from $x_1$. By
a similar argument, the $\frac{2n}{\alpha(2)+1}$ edges closest to $x_2$ must be in the tree as well. Observe that these edges form
a continuous sequence on the cycle with those edges near $x_1$. Continue in this manner to define $x_3, \ldots, x_{n'}$,
and conclude that there are at least

\[ \sum_{i=1}^{n'} \frac{2n}{\alpha(i)+1} \ge \sum_{i=1}^{n'} \frac{n}{\alpha(i)} > n \tag{2} \]

edges that are not allowed to be removed, but this is a contradiction, as there are only $n$ edges in $C_n$. □

Theorem 3. For any non-decreasing function $\alpha : \mathbb{N} \to \mathbb{R}$ with $\alpha \notin \Phi$, there exists an integer $n$, a metric
$(X, d)$ on $n$ points and a priority ranking $x_1, \ldots, x_n \in X$, such that there is no embedding of $X$ into a
dominating tree metric with priority distortion less than $\alpha/8$.

Proof. Take $n$, the metric $(X, d)$ induced by $C_n$, and the same priority ranking as in Theorem 2. First
consider any tree $T$ with exactly $n$ vertices, but which is not necessarily spanning. That is, $T$ is allowed
to have edges that did not exist in $C_n$. Since $T$ must be dominating, we may assume that an edge in $T$
connecting vertices of distance $k$ in $C_n$ will have weight exactly $k$ (if it has larger weight, reducing it to $k$
can only improve the distortion). We extend an argument of [Gup01] to prove that the priority distortion of
$T$ is at least $\alpha$.
The argument in Section 7 of [Gup01] says that $T$ can be replaced by a tree $T'$ satisfying $d \le d_{T'} \le d_T$,
and such that any vertex in $T'$ has at most one edge to its left semicircle and one edge to its right semicircle.⁵
A crucial observation (made in [Gup01]) is that for any pair of vertices at distance $k$ in $C_n$, their distance
in $T'$ can be either $k$ or at least $n - k$. Now we may use a similar reasoning as in the proof of Theorem 2.
Assume that $x_1$ is the $i$-th vertex of $C_n$, and observe that any vertex $i + j$ for $1 \le j \le a_1$ must be connected
by an edge to one of the vertices $i, i + 1, \ldots, i + j - 1$, as otherwise $d_{T'}(i, i + j) \ge n - a_1$, and the distortion
of the pair $\{x_1, i + j\}$ will be at least $\alpha(1)$. Notice that the edges $x_2$ forces to exist are disjoint from those of
$x_1$. It follows that for each $1 \le i \le n'$, $x_i$ forces at least $\frac{2n}{\alpha(i)+1}$ disjoint edges to be in the tree, which is
impossible due to (2).

Finally, consider arbitrary dominating tree metrics, which may have Steiner nodes (nodes which no
vertex of $C_n$ is mapped onto). By a result of [Gup01], such nodes may be removed while increasing the
distance between any pair of points by a factor of at most 8, so we conclude that such a tree cannot have priority
distortion less than $\alpha/8$.

□ 


4 Probabilistic Embedding into Ultrametrics with Prioritized Distortion 

An ultrametric $(U, d)$ is a metric space satisfying a strong form of the triangle inequality, that is, for all
$x, y, z \in U$, $d(x, z) \le \max\{d(x, y), d(y, z)\}$. The following definition is known to be an equivalent one
(see [BLMN05]).

Definition 1. An ultrametric $U$ is a metric space $(U, d)$ whose elements are the leaves of a rooted labeled
tree $T$. Each $z \in T$ is associated with a label $\Phi(z) \ge 0$ such that if $q \in T$ is a descendant of $z$ then
$\Phi(q) \le \Phi(z)$, and $\Phi(q) = 0$ iff $q$ is a leaf. The distance between leaves $z, q \in U$ is defined as $d_T(z, q) =
\Phi(\mathrm{lca}(z, q))$ where $\mathrm{lca}(z, q)$ is the least common ancestor of $z$ and $q$ in $T$.


Theorem 4. For any metric space $(X, d)$, there exists a distribution over embeddings of $X$ into ultrametrics
with expected prioritized distortion $O(\log j)$.


Proof. Let $x_1, \ldots, x_n$ be the priority ranking of $X$, and let $\Delta$ be the diameter of $X$. We assume w.l.o.g.
that the minimal distance in $X$ is 1, and let $\delta$ be the minimal integer so that $\Delta \le 2^{\delta}$. We shall create a
hierarchical laminar partition, where for each $i \in \{0, 1, \ldots, \delta\}$, the clusters of level $i$ have diameter at most
$2^i$, and each of them is contained in some level $i + 1$ cluster. The ultrametric is built in the natural manner:
the root corresponds to the level $\delta$ cluster which is $X$, and each cluster in level $i$ corresponds to an inner
node of the ultrametric with label $2^i$, whose children correspond to the level $i - 1$ clusters contained in it.
The leaves correspond to singletons, that is, to the elements of $X$. Clearly, the ultrametric will dominate
$(X, d)$.

In order to define the partition, we choose a random permutation $\pi : X \to [n]$ which is strongly
correlated with the priority ranking, and in addition we choose some number $\beta \in [1, 2]$. Let $K_0 = \{x_1, x_2\}$,
and for any integer $1 \le j \le \lceil \log\log n \rceil$ let $K_j = \{x_h : 2^{2^{j-1}} < h \le 2^{2^j}\}$. The permutation $\pi$
is created by choosing a uniformly random permutation on each $K_j$, and concatenating these. Note that
$\pi^{-1}(\{h \in \mathbb{N} : h \in (2^{2^{j-1}}, 2^{2^j}]\}) = K_j$, and $\pi^{-1}(\{1, 2\}) = K_0$.
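
The grouping of ranks and the resulting permutation can be sketched as follows. This is a minimal illustration working on ranks $1, \ldots, n$ instead of the points themselves; the function names are hypothetical.

```python
import random

def priority_groups(n):
    """Split ranks 1..n into K_0 = {1, 2} and
    K_j = {h : 2^(2^(j-1)) < h <= 2^(2^j)} for j >= 1."""
    groups, lo, j = [], 0, 0
    while lo < n:
        hi = min(n, 2 ** (2 ** j))
        groups.append(list(range(lo + 1, hi + 1)))
        lo, j = hi, j + 1
    return groups

def correlated_permutation(n, rng=random):
    """Shuffle uniformly inside each group and concatenate the groups in order.
    Returns the ranks in the order in which they will act as centers."""
    order = []
    for group in priority_groups(n):
        block = group[:]
        rng.shuffle(block)
        order.extend(block)
    return order
```

Every rank in $K_a$ precedes every rank in $K_b$ for $a < b$, while the order inside a group is uniformly random; group sizes grow doubly exponentially, so there are only about $\log\log n$ groups.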


⁵ If the vertices of $C_n$ are labeled $0, 1, \ldots, n - 1$ as ordered on the cycle, the right semicircle of vertex $i$ is $\{i + 1, i + 2, \ldots, i +
\lfloor n/2 \rfloor\}$ (addition is modulo $n$), and the left semicircle is $V \setminus \{i, i + 1, i + 2, \ldots, i + \lfloor n/2 \rfloor\}$.





In each step $i$, we partition a cluster $S$ of level $i + 1$ as follows. Each point $x \in S$ chooses the point
$u \in X$ with minimal value according to $\pi$ among the points of distance at most $\beta_i := \beta \cdot 2^{i-2}$ from $x$, and
joins the cluster of $u$. Note that a point may not belong to the cluster associated with it, and some clusters
may be empty (which we can discard). The description of the hierarchical partition appears in Algorithm 1.

Algorithm 1 Modified FRT($X$, $\pi$)

1: Choose a random permutation $\pi : X \to [n]$ as above.
2: Choose $\beta \in [1, 2]$ randomly by the distribution with the following probability density function $p(x) = \frac{1}{x \ln 2}$.
3: Let $D_\delta = X$; $i \leftarrow \delta - 1$.
4: while $D_{i+1}$ has non-singleton clusters do
5:   Set $\beta_i \leftarrow \beta \cdot 2^{i-2}$.
6:   for $l = 1, \ldots, n$ do
7:     for every cluster $S$ in $D_{i+1}$ do
8:       Create a new cluster in $D_i$, consisting of all unassigned points in $S$ closer than $\beta_i$ to $\pi(l)$.
9:     end for
10:   end for
11:   $i \leftarrow i - 1$.
12: end while
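
The hierarchical partition of Algorithm 1 can be sketched as below. This is an illustrative sketch under stated assumptions: the permutation is passed in as `order`, the list of points in the order in which they act as centers; distances are assumed to be at least 1 between distinct points; a point joins a center when its distance is at most $\beta_i$; and the names `sample_beta` and `hierarchical_partition` are hypothetical.

```python
import math
import random

def sample_beta(rng=random):
    """Draw beta in [1, 2) with density 1 / (x ln 2), by inverting the CDF log2(x)."""
    return 2.0 ** rng.random()

def hierarchical_partition(points, d, order, beta):
    """points: hashable points; d: metric with minimum distance 1;
    order: the points listed in the order of the permutation;
    Returns {level: list of clusters}, from the top level down to singletons."""
    diameter = max((d(a, b) for a in points for b in points), default=0)
    delta = math.ceil(math.log2(diameter)) if diameter > 1 else 0
    levels = {delta: [list(points)]}
    i = delta - 1
    while any(len(c) > 1 for c in levels[i + 1]):
        beta_i = beta * 2.0 ** (i - 2)
        assigned = set()
        new_level = []
        for center in order:                  # l = 1, ..., n
            for cluster in levels[i + 1]:     # every cluster S in D_{i+1}
                part = [z for z in cluster
                        if z not in assigned and d(z, center) <= beta_i]
                if part:                      # empty clusters are discarded
                    assigned.update(part)
                    new_level.append(part)
        levels[i] = new_level
        i -= 1
    return levels
```

Each level refines the one above it, and a cluster at level $i$ lies within distance $\beta_i \le 2^{i-1}$ of its center, so its diameter is at most $2^i$. Every point is eventually assigned at each level because it is within distance 0 of itself.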


Let $T$ denote the ultrametric created by the hierarchical partition of Algorithm 1, and $d_T(u, v)$ the
distance between $u$ and $v$ in $T$. Consider the clustering step at some level $i$, where clusters in $D_{i+1}$ are picked
for partitioning. In each iteration $l$, all unassigned points $z$ such that $d(z, \pi(l)) \le \beta_i$ assign themselves to
the cluster of $\pi(l)$. Fix an arbitrary pair $\{v, u\}$. We say that center $w$ settles the pair $\{v, u\}$ at level $i$, if it
is the first center so that at least one of $u$ and $v$ gets assigned to its cluster. Note that exactly one center $w$
settles any pair $\{v, u\}$ at any particular level. Further, we say that a center $w$ cuts the pair $\{v, u\}$ at level $i$,
if it settles them at this level, and exactly one of $u$ and $v$ is assigned to the cluster of $w$ at level $i$. Whenever
$w$ cuts a pair $\{v, u\}$ at level $i$, $d_T(v, u)$ is set to be $2^{i+1} \le 8\beta_i$. We blame this length on the point $w$ and
define $d_T^{w}(v, u)$ to be $\sum_i \mathbf{1}(w \text{ cuts } \{v, u\} \text{ at level } i) \cdot 8\beta_i$ (where $\mathbf{1}(\cdot)$ denotes an indicator function). We
also define $d_T^{K_j}(v, u) = \sum_{w \in K_j} d_T^{w}(v, u)$. Clearly, $d_T(v, u) \le \sum_j d_T^{K_j}(v, u)$.

Fix some $0 \le j \le \lceil \log\log n \rceil$. Our next goal is to bound the expected value of $d_T^{K_j}(v, u)$ by $O(\log |K_j|) \cdot d(v, u)$.
We arrange the points of $K_j$ in non-decreasing order of their distance from the pair $\{v, u\}$ (breaking ties
arbitrarily). Consider the $s$-th point $w_s$ in this sequence. W.l.o.g. assume that $d(w_s, v) \le d(w_s, u)$. For a
center $w_s$ to cut $\{v, u\}$, it must be the case that:

1. $d(w_s, v) \le \beta_i < d(w_s, u)$ for some $i$.

2. $w_s$ settles $\{v, u\}$ at level $i$.

Note that for each x € [d (w s ,v ), d ( w s ,u )), the probability that fa € [x,x + dx) is at most Con¬ 

ditioning on fa taking such a value x, any one of w± ,..., w s can settle { /;, u}. The probability that w s is 
first in the permutation 7r among tt’i.... w s is (In fact, there may be points from Uo<r<j ^ at scl bc 
{t;, u } before w s . It is safe to ignore that, as it can only decrease the probability that w s cuts {v, tt}.) Thus 
we obtain, 

$$E[d_T^{w_s}(v, u)] \le \int_{d(w_s, v)}^{d(w_s, u)} 8x \cdot \frac{dx}{x \ln 2} \cdot \frac{1}{s} = \frac{8}{s \cdot \ln 2}\,(d(w_s, u) - d(w_s, v)) \le \frac{16}{s}\, d(v, u) . \quad (3)$$
Hence we conclude,

$$E[d_T^{K_j}(v, u)] \le \sum_{w_s \in K_j} E[d_T^{w_s}(v, u)] \overset{(3)}{\le} 16\, d(v, u) \sum_{s=1}^{|K_j|} \frac{1}{s} = O(d(v, u)) \cdot \log |K_j| . \quad (4)$$


Assume $v = x_h$ is the $h$-th vertex in the priority ranking for some $h > 2$. Let $a$ be the integer such that $v \in K_a$, and recall that $2^{2^{a-1}} < h \le 2^{2^a}$, i.e., $2^a < 2 \log h$. The crucial observation is that if $y \in K_b$ such that $b > a$, then $y$ cannot settle $\{v, u\}$. The reason is that $v$ always appears before $y$ in $\pi$, so $v$ will surely be assigned to a cluster when it is the turn of $y$ to create a cluster. This leads to the conclusion that for all $b > a$, $E[d_T^{K_b}(v, u)] = 0$. We conclude:


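$$\begin{aligned}
E[d_T(v, u)] &\le \sum_{j=0}^{a} E[d_T^{K_j}(v, u)] \\
&\overset{(4)}{=} O(d(v, u)) \cdot \sum_{j=0}^{a} \log |K_j| \\
&= O(d(v, u)) \cdot \sum_{j=0}^{a} \log 2^{2^j} \\
&= O(d(v, u)) \cdot \sum_{j=0}^{a} 2^j \\
&= O(d(v, u)) \cdot 2^a \\
&= O(d(v, u)) \cdot \log h .
\end{aligned}$$

When $h \in \{1, 2\}$ we can take $a = 0$, and thus obtain a bound of $O(d(v, u))$.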




5 Distance Oracles with Prioritized Stretch 

In this section we consider distance oracles where the stretch scales with the priority of the vertices. See 
Section 2 for the basic definitions. A classical result of [TZ05] (with improved query time due to [Che14]) asserts that for any parameter $t \ge 1$ and any graph on $n$ vertices, there exists a $(2t - 1)$-stretch distance oracle of space $O(t \cdot n^{1+1/t})$ with $O(1)$ query time. An additional important result of [MN06] allows for very small space: their oracle has space $O(n^{1+1/t})$ with stretch $O(t)$, and $O(1)$ query time as well.


5.1 Prioritized Stretch with Small Space 


Our first result provides a range of distance oracles with prioritized stretch and extremely low space. They 
also exhibit a somewhat non-intuitive (although very good) dependence of the stretch on the priority of the 
vertices. The drawbacks of these oracles are that they cannot report the approximate paths in the graph
between the queried vertices, and it is not clear that they can be distributed as a labeling scheme. 


For the sake of brevity, denote by $r(j) = \left\lfloor \frac{\log n}{\log(n/j)} \right\rfloor$ (where $n$ is always the number of vertices). For a function $f : \mathbb{N} \to \mathbb{N}$, define its iterative application $F : \mathbb{N} \to \mathbb{N}$ as follows: $F(0) = 1$, and for integer $k \ge 1$, $F(k) = f(F(k - 1))$. That is, $F(k)$ is determined by iteratively applying $f$ for $k$ times starting at 1.
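The two quantities just defined are easy to compute directly. The sketch below assumes that $r(j)$ is the floor of $\log n / \log(n/j)$ (read as infinite for $j = n$, where the denominator vanishes) and uses hypothetical helper names `r` and `iterate`. Floating-point logarithms can be off by one when the ratio is an exact integer.

```python
import math


def r(j, n):
    """r(j) = floor(log n / log(n / j)); taken as infinity when j = n."""
    if j >= n:
        return math.inf
    return math.floor(math.log(n) / math.log(n / j))


def iterate(f, k):
    """F(k): apply f to 1 exactly k times, so F(0) = 1 and F(k) = f(F(k - 1))."""
    value = 1
    for _ in range(k):
        value = f(value)
    return value
```

For instance, `iterate(lambda k: 2 * k, 4)` evaluates $F(4)$ for $f(k) = 2k$, and `r(16, 1024)` evaluates $r(j)$ for a vertex of rank 16 among 1024.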
Theorem 5. Let $G = (V, E)$ be a weighted graph on $n$ vertices. For any positive integer $T$, let $f : \mathbb{N} \to \mathbb{R}_+$ be any monotone increasing function such that $f(1) = 2$ and $F(T) \ge \log n$. Then there exists a distance oracle that requires space $O(\sum_{k=1}^{T} F(k) \cdot n)$ and has prioritized stretch

$$\min\{4 f(r(j)) - 5, \log n\} .$$

Alternatively, one may obtain a distance oracle with space $O(T \cdot n)$ and prioritized stretch

$$\min\{O(f(r(j))), \log n\} .$$

Both oracles have $O(1)$ query time.


Corollary 2. Any weighted graph $G = (V, E)$ on $n$ vertices admits distance oracles with the following possible tradeoffs between space and prioritized stretch.

1) Space $O(n \log^2 n)$ and prioritized stretch $\min\{4 r(j) - 1, \log n\}$;

2) Space $O(n \log n)$ and prioritized stretch $\min\{8 r(j) - 5, \log n\}$;

3) Space $O(n \log\log n)$ and prioritized stretch $\min\{O(r(j)), \log n\}$;

4) Space $O(n \log\log\log n)$ and prioritized stretch $\min\{O(r(j)^2), \log n\}$;

5) Space $O(n \log^* n)$ and prioritized stretch $\min\{O(2^{r(j)}), \log n\}$.

Observe that the first two oracles have stretch 3 for all points of priority less than $\sqrt{n}$, and that in all of these oracles, for any fixed $\epsilon > 0$, all vertices of priority at most $n^{1-\epsilon}$ have constant stretch.

Proof of Corollary 2. All the tradeoffs follow by simple choices for $T$ and $f$, which are described in the next bullets.

• For the first tradeoff let $T = \log n$ (assume w.l.o.g. this is an integer), and take the function $f(k) = k + 1$, so that $F(k) = k + 1$ as well for all $k$. Thus the space is indeed $O(n \cdot \sum_{k=1}^{\log n} (k + 1)) = O(n \log^2 n)$, and the prioritized stretch is $4 r(j) - 1$ by the first assertion of Theorem 5.

• For the second tradeoff, using $T = \log\log n$, it suffices to take $f(k) = 2k$, so that $F(k) = 2^k$. The space is now $O(n \cdot \sum_{k=1}^{\log\log n} 2^k) = O(n \log n)$ and the prioritized stretch is as promised, applying the first assertion of Theorem 5 again.

• In the third tradeoff we use again $T = \log\log n$, $f(k) = 2k$ and $F(k) = 2^k$. This time using the second assertion of Theorem 5, the space is $O(n \log\log n)$, and the prioritized stretch is $O(r(j))$.

• In the fourth tradeoff we use $T = 1 + \log\log\log n$, and let $f(1) = 2$ and for $k \ge 2$, $f(k) = k^2$. It implies that $F(k) = 2^{2^{k-1}}$, so by the second assertion of Theorem 5 the space is indeed $O(n \log\log\log n)$, and the prioritized stretch follows similarly.

• The final tradeoff holds by taking $T = \log^* n - 1$, and setting $f(k) = 2^k$, so that $F(k) = \mathrm{tower}(k)$.$^6$ The bounds on the space and the prioritized stretch follow as before.

□
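As a sanity check of the choices in the proof, the following sketch tabulates $F(T)$ and the space factor $\sum_{k=1}^{T} F(k)$ for a concrete size $n = 2^{16}$, where $\log n = 16$, $\log\log n = 4$, $\log\log\log n = 2$ and $\log^* n = 4$. The helper names and the dictionary layout are illustrative assumptions, not part of the construction.

```python
def iterate(f, k):
    """F(k): apply f to 1 exactly k times."""
    value = 1
    for _ in range(k):
        value = f(value)
    return value


def square_after_one(k):
    # f(1) = 2 and f(k) = k^2 for k >= 2, as in the fourth tradeoff
    return 2 if k == 1 else k * k


LOG_N = 16  # log n for n = 2^16

# Each entry: (f, T) as chosen in the proof of Corollary 2.
choices = {
    "f(k)=k+1": (lambda k: k + 1, 16),   # T = log n
    "f(k)=2k": (lambda k: 2 * k, 4),     # T = log log n
    "f(k)=k^2": (square_after_one, 3),   # T = 1 + log log log n
    "f(k)=2^k": (lambda k: 2 ** k, 3),   # T = log* n - 1
}

# summary[name] = (F(T), sum of F(k) for k = 1..T)
summary = {}
for name, (f, T) in choices.items():
    values = [iterate(f, k) for k in range(1, T + 1)]
    summary[name] = (values[-1], sum(values))
```

In every row $F(T) \ge \log n$, as Theorem 5 requires, and the second component is the factor multiplying $n$ in the space bound of the first assertion.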


We now turn to proving the theorem, and start with the following lemma. 

6 $\mathrm{tower}(k)$ is defined as $\mathrm{tower}(0) = 1$ and $\mathrm{tower}(k) = 2^{\mathrm{tower}(k-1)}$, so that $\mathrm{tower}(\log^* n) = n$.
Lemma 1. For any $t \ge 1$, and any graph $G = (V, E)$ on $n$ vertices with a subset $K \subseteq V$ of size $|K| = k$, there exists a distance oracle which can answer in $O(1)$ time queries on every pair in $K \times V$ with either:

• Stretch $4t - 1$, using space $O(t \cdot k^{1+1/t} + n)$.

• Stretch $O(t)$, using space $O(k^{1+1/t} + n)$.

Proof. For the first assertion, apply the distance oracle of [Che14] on the complete graph $G' = (K, E')$ with parameter $t$, where the weight of each edge in $E'$ is the shortest path distance in $G$ between its endpoints. This gives stretch $2t - 1$ for any pair in $K \times K$ and requires space $O(t \cdot k^{1+1/t})$. For every vertex $u \in V \setminus K$, store only $d_G(u, K)$ and the name of the vertex $k_u \in K$ that realizes this distance (that is, $d_G(u, k_u) = d_G(u, K)$). We obtain a data structure of space $O(t \cdot k^{1+1/t} + n)$. To answer a distance query between $v \in K$ and $u \in V$, report $\hat{d}(v, k_u) + d_G(k_u, u)$ where $\hat{d}$ is the distance reported by the oracle of $G'$. It remains to bound the stretch: Observe that since $k_u$ is the closest vertex to $u$ in $K$, we have that $d_G(v, k_u) \le d_G(v, u) + d_G(k_u, u) \le 2 d_G(u, v)$, and thus the reported distance is bounded as follows,

$$\hat{d}(v, k_u) + d_G(k_u, u) \le (2t - 1)\, d_G(v, k_u) + d_G(u, v) \le (4t - 1)\, d_G(u, v) .$$

Using the triangle inequality, the reported distance is never smaller than the original,

$$\hat{d}(v, k_u) + d_G(k_u, u) \ge d_G(v, k_u) + d_G(k_u, u) \ge d_G(u, v) .$$

The second assertion follows by applying the oracle of [MN06] rather than that of [Che14], which yields stretch $O(t)$ on $K \times V$, and space $O(k^{1+1/t} + n)$, by a similar argument. □
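The reduction in the proof can be illustrated with a small sketch. As an explicit simplification, the inner oracle on $G'$ is replaced here by a table of exact $K \times K$ distances (so it behaves like the case $t = 1$ and the overall stretch bound becomes $4 \cdot 1 - 1 = 3$); the class name `SubsetOracle`, the adjacency-list format and the integer vertex names are assumptions made for the example.

```python
import heapq


def dijkstra(adj, sources):
    """Distance to the nearest source, and which source attains it."""
    dist = {s: 0 for s in sources}
    owner = {s: s for s in sources}
    heap = [(0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue                      # stale heap entry
        for y, w in adj[x]:
            if d + w < dist.get(y, float("inf")):
                dist[y] = d + w
                owner[y] = owner[x]
                heapq.heappush(heap, (d + w, y))
    return dist, owner


class SubsetOracle:
    """Answers queries on K x V from a K x K table plus one pointer per vertex."""

    def __init__(self, adj, K):
        # Stand-in for the oracle on G' = (K, E'): exact K x K distances.
        self.inner = {}
        for v in K:
            dv, _ = dijkstra(adj, [v])
            self.inner[v] = {k: dv[k] for k in K}
        # d_G(u, K) and the vertex k_u realizing it, for every u in V.
        self.to_K, self.nearest = dijkstra(adj, list(K))

    def query(self, v, u):
        """Estimate d_G(v, u) for v in K and any u in V."""
        k_u = self.nearest[u]
        return self.inner[v][k_u] + self.to_K[u]
```

On a unit-weight path $0, 1, \dots, 5$ with $K = \{0, 3\}$, the query between 3 and 1 is routed through $k_1 = 0$ and returns 4, while the true distance is 2, which stays within the factor 3.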

We are finally ready to prove Theorem 5. 

Proof of Theorem 5. We begin with the first assertion of the theorem. Let $x_1, \dots, x_n \in V$ be the priority ranking of $V$. For each $i \in [T]$, let $S_i = \{x_j : 1 \le j \le n^{1 - 1/F(i)}\}$, and apply the first oracle of Lemma 1 on $G$ with the set $S_i$ and parameter $t_i = F(i) - 1$; let $O_i$ be the resulting oracle.$^7$ Also invoke the oracle $O_{MN}$ of [MN06] on $G$, that has stretch $\log n$ on all pairs using only $O(n)$ space (with $O(1)$ query time).

Observe that for each $i \in [T]$, the parameter $t_i$ was chosen so that $(1 - 1/F(i)) \cdot (1 + 1/t_i) = 1$, so that the oracle $O_i$ has space

$$O(t_i \cdot |S_i|^{1 + 1/t_i} + n) = O(F(i) \cdot n) .$$

The total space is thus $O(\sum_{i=1}^{T} F(i) \cdot n)$, as promised. It remains to prove the prioritized stretch guarantee. Fix any $v = x_j$, and let $i$ be the minimal such that $x_j \in S_i$ (observe that if $j > n/2$ there is not necessarily any such $i$). For $i = 1$ the stretch guaranteed by $O_1$ is $4 t_1 - 1 = 4(F(1) - 1) - 1 = 3$, as promised (recall that $f(k) \ge 2$ for all $k \ge 1$, so the required stretch is never smaller than 3). For $i > 1$, by minimality of $i$ it follows that $j > n^{1 - 1/F(i-1)}$, that is, $F(i - 1) \le \left\lfloor \frac{\log n}{\log(n/j)} \right\rfloor = r(j)$ (since $F(i - 1)$ is an integer). The stretch of $O_i$ for $v$ with any other point is at most

$$4(F(i) - 1) - 1 = 4 F(i) - 5 = 4 f(F(i - 1)) - 5 \le 4 f(r(j)) - 5 ,$$

while the stretch of $O_{MN}$ is at most $\log n$ for all pairs, which handles the case that no $i$ exists, and allows us to report the minimum of the two terms. The query time is clearly $O(1)$.

The proof of the second assertion is very similar; the only difference is using the second oracle given by Lemma 1. This implies that oracle $O_i$ has space $O(n)$, and thus the total space is only $O(T \cdot n)$, although the stretch of this oracle is worse by a constant factor. □

7 Since $F(0) = 1$ and $f$ is strictly monotone, it follows that $F(i) \ge 2$ for all $i \ge 1$, so that $t_i \ge 1$.
5.2 Prioritized Distance Oracles with Bounded Prioritized Stretch

In this section we prove the following theorem, which prioritizes the stretch of the distance oracle of [TZ05]. Unlike the oracles of Theorem 5, this oracle can also support path queries, that is, return a path in the graph that achieves the required stretch, in time proportional to its length (plus the distance query time). Additionally, it can be distributed as a labeling scheme, which we exploit in the next section. Furthermore, this oracle matches the best known bounds for the worst-case stretch of [TZ05], which are conjectured to be optimal.

Theorem 6. Let $G = (V, E)$ be a graph with $n$ vertices. Given a parameter $t \ge 1$, there exists a distance oracle of space $O(t n^{1+1/t})$ with prioritized stretch $2\left\lceil \frac{t \log j}{\log n} \right\rceil - 1$ and query time $O\left(\left\lceil \frac{t \log j}{\log n} \right\rceil\right)$.

Overview. Recall that in the distance oracle construction of [TZ05], a sequence of sets $V = A_0 \supseteq A_1 \supseteq \dots \supseteq A_t = \emptyset$ is sampled randomly, by choosing each element of $A_{i-1}$ to be in $A_i$ with probability $n^{-1/t}$. We make the crucial observation that the distance oracle provides improved stretch of $2(t - i) - 1$, rather than $2t - 1$, to points in $A_i$. However, as these sets are chosen randomly, they have no correlation with our given priority list over the vertices. We therefore alter the construction, to ensure that points with high priority will surely be chosen to $A_i$ for sufficiently large $i$.

Proof of Theorem 6. Let $x_1, \dots, x_n \in V$ be the priority ranking of $V$. For each $i \in \{0, 1, \dots, t - 1\}$ let $S_i = \{x_j : 1 \le j \le n^{1 - i/t}\}$. Let $A_0 = V$, $A_t = \emptyset$, and for each $1 \le i \le t - 1$ define $A'_i$ by including every element of $A_{i-1}$ with probability $n^{-1/t}/2$, and let $A_i = A'_i \cup S_i$. For each $v \in V$ and $0 \le i \le t - 1$, define the $i$-th pivot $p_i(v)$ as the nearest point to $v$ in $A_i$, and $B_i(v) = \{w \in A_i : d(v, w) < d(v, A_{i+1})\}$.$^8$ Also the bunch of $v$ is defined as $B(v) = \bigcup_{0 \le i \le t-1} B_i(v)$. The distance oracle will store in a hash table, for each $v \in V$, all the distances to points in $B(v)$, and also the pivots $p_i(v)$.

The query algorithm for the distance between $u, v$ is essentially the same as in [TZ05]; the main difference is that we start the process at level $i$ rather than level 0, for a specified value of $i$.


Algorithm 2 Dist($v$, $u$, $i$)

$w \leftarrow v$;
while $w \notin B(u)$ do
    $i \leftarrow i + 1$;
    $(u, v) \leftarrow (v, u)$;
    $w \leftarrow p_i(v)$;
end while
return $d(w, u) + d(w, v)$;
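The construction and the query can be sketched together in Python. This is an illustration under stated assumptions, not the paper's implementation: the input is a list of distinct points in priority order with a metric `dist` (instead of a graph with shortest-path distances), membership in $S_i$ is tested with the equivalent integer inequality $j^t \le n^{t-i}$, and all function names are hypothetical.

```python
import math
import random


def in_S(j, i, n, t):
    """x_j is in S_i  iff  j <= n^(1 - i/t)  iff  j^t <= n^(t - i)."""
    return j ** t <= n ** (t - i)


def start_level(j, n, t):
    """The maximal i in [0, t - 1] such that x_j is in S_i."""
    return max(i for i in range(t) if in_S(j, i, n, t))


def build_oracle(points, dist, t, rng):
    """points[0] is x_1. Returns the pivots and bunches of every point."""
    n = len(points)
    p = n ** (-1 / t) / 2
    A = [set(points)]                                  # A_0 = V
    for i in range(1, t):
        S_i = {points[j - 1] for j in range(1, n + 1) if in_S(j, i, n, t)}
        sampled = {x for x in A[i - 1] if rng.random() < p}
        A.append(sampled | S_i)                        # A_i = A'_i united with S_i
    A.append(set())                                    # A_t is empty
    pivot, bunch = {}, {}
    for v in points:
        # near[i] = (d(v, A_i), nearest point of A_i); d(v, empty set) = infinity
        near = [min(((dist(v, w), w) for w in A[i]), default=(math.inf, None))
                for i in range(t + 1)]
        pivot[v] = [(w, d) for d, w in near[:t]]
        bunch[v] = {w: dist(v, w) for i in range(t) for w in A[i]
                    if dist(v, w) < near[i + 1][0]}
    return pivot, bunch


def query(pivot, bunch, v, u, i):
    """Algorithm 2, Dist(v, u, i); v must belong to A_i."""
    w, d_wv = v, 0.0
    while w not in bunch[u]:
        i += 1
        u, v = v, u
        w, d_wv = pivot[v][i]
    return bunch[u][w] + d_wv
```

A query for $v = x_j$ would be issued as `query(pivot, bunch, v, u, start_level(j, n, t))`; it reads only the stored bunches and pivot distances, which is what allows the structure to be split into per-vertex labels.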


Stretch. Let $v = x_j$ be the $j$-th point in the ordering for some $j > 1$, and fix any $u \in V$. (Observe that every vertex of $A_{t-1}$ lies in all the bunches, so when considering $x_1 \in A_{t-1}$, we have that $x_1 \in B(u)$ and so Algorithm 2 will return the exact distance.) Let $0 \le i \le t - 1$ be the integer satisfying $n^{1 - (i+1)/t} < j \le n^{1 - i/t}$, that is, the maximal $i$ such that $v \in S_i$. By definition we have that $v \in A_i$ as well, so we may run Dist($v$, $u$, $i$). Assuming that all operations in the hash table cost $O(1)$, the query time is $O(t - i)$. The stretch analysis is similar to [TZ05]: let $u_k$, $v_k$ and $w_k$ be the values of $u$, $v$ and $w$ at the $k$-th iteration; it suffices to show that at every iteration in which the algorithm did not stop, $d(v_k, w_k)$ increases by at most $d(u, v)$. It suffices because there are at most $t - 1 - i$ iterations (since $w_{t-1} \in A_{t-1}$, it lies in all bunches), so if $\ell$ is the final iteration, it must be that $d(v_\ell, w_\ell) \le (\ell - i) \cdot d(u, v)$ (initially $d(w_i, v_i) = 0$), and by the triangle inequality $d(w_\ell, u_\ell) \le d(u, v) + d(v_\ell, w_\ell) \le (\ell - i + 1) \cdot d(u, v)$, and as $\ell \le t - 1$ we conclude that

$$d(w, u) + d(w, v) \le (2(t - i) - 1) \cdot d(u, v) .$$

To see the increase by at most $d(u, v)$ at every iteration, we first note that $w_i = v_i \in A_i$ (this fact enables us to start at level $i$ rather than at level 0). In the $k$-th iteration, observe that as $w_k \notin B(u_k)$ but $w_k \in A_k$, it must be that $d(u_k, p_{k+1}(u_k)) \le d(u_k, w_k)$. The algorithm sets $w_{k+1} = p_{k+1}(u_k)$, $v_{k+1} = u_k$ and $u_{k+1} = v_k$, so we get that

$$d(v_{k+1}, w_{k+1}) = d(u_k, p_{k+1}(u_k)) \le d(u_k, w_k) \le d(u_k, v_k) + d(v_k, w_k) = d(u, v) + d(v_k, w_k) .$$

Note that as $n^{1 - (i+1)/t} < j \le n^{1 - i/t}$ it follows that $t - i - 1 < \frac{t \log j}{\log n} \le t - i$, so that $t - i = \left\lceil \frac{t \log j}{\log n} \right\rceil$. The guaranteed stretch for pairs containing $x_j$ is thus bounded by $2\left\lceil \frac{t \log j}{\log n} \right\rceil - 1$ (or stretch 1 for $x_1$).

8 We assume that $d(v, \emptyset) = \infty$ (this is needed as $A_t = \emptyset$).

Space. Fix any $u \in V$, and let us analyze the expected size of $B(u)$. Fix any $0 \le i \le t - 2$, and consider $B_i(u)$. Assume we have already chosen the set $A_i$, and arrange the vertices of $A_i = \{a_1, \dots, a_m\}$ in order of increasing distance to $u$. Note that if $a_r$ is the first vertex in the ordering to be in $A_{i+1}$, then $|B_i(u)| = r - 1$. Every vertex of $A_i$ is either in $S_{i+1}$, and thus will surely be included in $A_{i+1}$, or otherwise it has probability $n^{-1/t}/2$ to be in $A'_{i+1}$ and so in $A_{i+1}$ as well. The number of vertices that we see until the first success (being in $A_{i+1}$) is stochastically dominated by a geometric distribution with parameter $p = n^{-1/t}/2$, which has expectation $2 n^{1/t}$. For the last level $t - 1$, note that each vertex in $S_i \setminus S_{i+1}$ has probability exactly $n^{-1 + (i+1)/t} / 2^{t-1-i}$ to be included in $A_{t-1}$, independently of all other vertices. As $|S_i \setminus S_{i+1}| \le |S_i| = n^{1 - i/t}$, the expected number of vertices in $A_{t-1}$ is at most

$$\sum_{i=0}^{t-1} n^{1 - i/t} \cdot n^{-1 + (i+1)/t} / 2^{t-1-i} \le 2 n^{1/t} . \quad (5)$$

This implies that $E[|B_{t-1}(u)|] \le 2 n^{1/t}$ as well, and so $E[|B(u)|] \le 2t \cdot n^{1/t}$. The total expected size of all bunches is therefore at most $2t \cdot n^{1+1/t}$.

□


6 Prioritized Distance Labeling 

In this section we discuss distance labeling schemes, in which every vertex receives a short label, and it should be possible to approximately compute the distance between any two vertices from their labels alone. The novelty here is that we would like "important" vertices, those that have high priority, to have both improved stretch and also short labels.

6.1 Distance Labeling with Prioritized Stretch and Size 

We begin by showing that the stretch-prioritized oracle of Theorem 6 can be made into a labeling scheme, 
with the same stretch guarantees, and small label for high ranking points. The result has some dependence 
on n in the label size, and it seems to be interesting particularly for large values of t. Indeed, we shall 
use this result with parameter $t = \log n$ in the sequel, to obtain fully prioritized label size which will be
independent of n, and can support any desired maximum stretch. Furthermore, this result is the basis for 
our routing schemes with prioritized label size and stretch. 


Theorem 7. For any graph $G = (V, E)$ with $n$ vertices and any $t \ge 1$, there exists a distance labeling scheme with prioritized stretch $2\left\lceil \frac{t \log j}{\log n} \right\rceil - 1$ and prioritized label size $O(n^{1/t} \cdot \log j)$.

Proof. Using the same notation as Section 5, the label of vertex $v \in V$ consists of its hash table (which contains distances to all points in the bunch $B(v)$, and the identity of the pivots $p_i(v)$ for $0 \le i \le t - 1$). Note that Algorithm 2 uses only this information to compute the approximate distance. The stretch guarantee is prioritized as above, and it remains to give an appropriate bound on the label sizes.

Let $x_1, \dots, x_n \in V$ be the priority ranking of $V$. Fix a point $v = x_j$ for some $j > 1$, and let $i$ be the maximal such that $v \in S_i$. Note that this implies that $t - i - 1 < \frac{t \log j}{\log n}$. Observe that $B_0(v) \cup \dots \cup B_{i-1}(v) = \emptyset$, so it remains to bound the size of $B_i(v), \dots, B_{t-1}(v)$. For the last set $B_{t-1}(v) = A_{t-1}$, let $\mathcal{E}$ be the event that $|A_{t-1}| \le 8 n^{1/t}$. We already noted in (5) that the expected size of $A_{t-1}$ is at most $2 n^{1/t}$, thus using Markov's inequality, with probability at least 3/4 event $\mathcal{E}$ holds.

For $i \le k \le t - 2$, let $X_k$ be a random variable distributed geometrically with parameter $p = n^{-1/t}/2$, thus $E[X_k] = 2 n^{1/t}$ for all $k$. We noted above that the distribution of $X_k$ stochastically dominates the cardinality of $B_k(v)$, thus it suffices to bound $\sum_{k=i}^{t-2} X_k$. Observe that for any integer $s$, if $\sum_{k=i}^{t-2} X_k > s$ then it means that in a sequence of $s$ independent coin tosses with probability $p$ for heads, we have seen fewer than $t - 1 - i$ heads. That is, if $Z \sim \mathrm{Bin}(s, p)$ is a Binomial random variable then

$$\Pr\left[\sum_{k=i}^{t-2} X_k > s\right] \le \Pr[Z < t - 1 - i] \le \Pr\left[Z < \frac{t \log j}{\log n}\right] \le \Pr[Z < \log j] .$$

Take $s = 16 n^{1/t} \cdot \log j$ (assume this is an integer), so that $\mu := E[Z] = 8 \log j$, and by a standard Chernoff bound

$$\Pr[Z < \log j] = \Pr[Z < \mu/8] \le e^{-3\mu/8} \le 1/j^3 .$$

Let $\mathcal{F}$ be the event that there exists $2 \le j \le n$ with $\left|\bigcup_{k=0}^{t-2} B_k(x_j)\right| > 16 n^{1/t} \cdot \log j$. Then by a union bound over all $2 \le j \le n$ (note that the bound is non-uniform, and depends on $j$), we obtain that

$$\Pr[\mathcal{F}] \le \sum_{j=2}^{n} \Pr\left[\left|\bigcup_{k=0}^{t-2} B_k(x_j)\right| > 16 n^{1/t} \cdot \log j\right] \le \sum_{j=2}^{n} 1/j^3 \le 1/4 .$$

We conclude that with probability at least 1/2 event $\mathcal{E}$ holds and event $\mathcal{F}$ does not, which means that the size of the bunch of each $x_j$ is bounded by $O(n^{1/t} \cdot \log j)$, as required. (Recall that $x_1 \in A_{t-1}$, so its label size is $|A_{t-1}| \le 8 n^{1/t}$ when event $\mathcal{E}$ holds.)

□


Corollary 3. Any graph $G = (V, E)$ has a distance labeling scheme with prioritized stretch $2\lceil \log j \rceil - 1$ and prioritized label size $O(\log j)$.


6.2 Distance Labeling with Prioritized Label Size 

In this section we construct a labeling scheme in which the maximum stretch is fixed for all points, and the 
label size is fully prioritized and independent of n. 

Theorem 8. For any graph $G = (V, E)$ and an integer $t \ge 1$, there exists a distance labeling scheme with stretch $2t - 1$ and prioritized label size $O(j^{1/t} \cdot \log j)$.
Proof Overview. The idea is to partition the vertices into $m := \left\lceil \frac{\log n}{t} \right\rceil$ sets $S_1, \dots, S_m$, and to apply the result of Section 6.1 in conjunction with a variation of the source-restricted distance oracles of [RTZ05], using a labeling scheme rather than an oracle. In a source-restricted labeling scheme on $X$ with a subset $S \subseteq X$, only distances between pairs in $S \times X$ can be queried. Replacing the source-restricted oracle with a labeling scheme demands that we use an analysis similar to Section 6.1 to guarantee a prioritized bound on the label sizes. We will apply this for each $i \in \{2, 3, \dots, m\}$ with $X = S_i \cup \dots \cup S_m$ and the subset $S_i$. Thus an element of $S_i$ will have a label which consists of $i$ schemes, and we will guarantee that their sizes form a geometric progression, so that the total label size is sufficiently small.

As it turns out, the construction of [RTZ05] is inadequate for the first $2^t$ elements $S_1$, which have a very strict requirement on their label size. We will use the construction of Section 6.1 to handle distances involving the elements in $S_1$. Fortunately, the stretch incurred by this construction is $2\lceil \log j \rceil - 1$, which is bounded by $2t - 1$ for the first $2^t$ elements in the ranking. We begin by stating the source-restricted distance labeling, based on [RTZ05].

Theorem 9. For any integer $t \ge 1$, any graph $G = (V, E)$ and a subset $S \subseteq V$, there exists a source-restricted distance labeling scheme with stretch $2t - 1$ and prioritized label size $O(|S|^{1/t} \cdot \log j)$.

Proof. The observation made in [RTZ05] is that to obtain a source-restricted distance oracle, it suffices to sample the random sets $S = A_0 \supseteq A_1 \supseteq \dots \supseteq A_t = \emptyset$ only from $S$, where each element of $A_{i-1}$ is included in $A_i$ independently with probability $|S|^{-1/t}$. They show that defining the bunches as in [TZ05], the resulting stretch is $2t - 1$ for all pairs in $S \times V$. We shall use a similar analysis as in Theorem 7 to argue that this can be made into a labeling scheme. The expected label size is $O(|S|^{1/t})$, and we can show that with constant probability, every point $x_j$ pays only an additional factor of $O(\log j)$. As the proof is very similar, we leave the details to the reader. □

Proof of Theorem 8. Let $S_1 = \{x_j : 1 \le j \le 2^t\}$, and for each $i \in \{2, 3, \dots, m\}$ let $S_i = \{x_j : 2^{(i-1)t} < j \le 2^{it}\}$. We have a separate construction for $i = 1$ and for $i > 1$. For the case $i = 1$, use the labeling scheme of Corollary 3 on $G = (V, E)$. For each $2 \le i \le m$, apply Theorem 9 on $G$ and the subset $S_i$, but append the resulting labels only for vertices in $S_i \cup \dots \cup S_m$.

Fix any $u, v \in V$, and w.l.o.g. assume that $v \in S_i$ has higher rank than $u$. This implies that $u \in S_i \cup \dots \cup S_m$, thus the source-restricted labeling scheme for $S_i$ guarantees stretch at most $2t - 1$ for the pair $u, v$ (and $u$ indeed stored the appropriate label). Note that in the case of $v = x_j \in S_1$, the stretch can be improved to $2\lceil \log j \rceil - 1$ (recall that $\log j \le t$).

We now turn to bounding the label sizes. First consider $v = x_j \in S_1$; then it must be that $j \le 2^t$. The label size of $v$ is by Corollary 3 at most $O(\log j)$, and this is the final label of $v$. For $v = x_j \in S_i$ when $i \ge 2$, the label of $v$ consists of labels created for the sets $S_1, \dots, S_i$. Notice that $2^{(i-1)t} < j \le 2^{it}$, so it holds that $2^i = 2 \cdot (2^{(i-1)t})^{1/t} < 2 j^{1/t}$. By Corollary 3 the label due to $S_1$ is at most $O(\log j)$, and using Theorem 9 the label size of $v$ is at most

$$O(\log j) + \sum_{k=2}^{i} O(|S_k|^{1/t} \cdot \log j) = O(\log j) \cdot \sum_{k=1}^{i} 2^k = O(2^i \cdot \log j) = O(j^{1/t} \cdot \log j) .$$

□
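The accounting in this proof can be checked numerically. The sketch below assumes the sets are $S_1 = \{x_j : j \le 2^t\}$ and $S_i = \{x_j : 2^{(i-1)t} < j \le 2^{it}\}$, and uses the bound $|S_k|^{1/t} \le 2^k$; the function names are hypothetical and the $O(\log j)$ factor shared by all terms is left out.

```python
def set_index(j, t):
    """The index i such that x_j belongs to S_i (smallest i with j <= 2^(i t))."""
    i = 1
    while j > 2 ** (i * t):
        i += 1
    return i


def scheme_size_bounds(j, t):
    """Upper bounds 2^k on |S_k|^(1/t) for the schemes k = 2..i stored by x_j."""
    i = set_index(j, t)
    return [2 ** k for k in range(2, i + 1)]
```

Since the terms form a geometric progression ending at $2^i < 2 j^{1/t}$, their sum stays below $4 j^{1/t}$; for example, with $t = 2$ and $j = 100$ the vertex lies in $S_4$ and the terms are 4, 8 and 16.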




6.3 Prioritized Distance Labeling for Graphs with Bounded Separators 

6.3.1 Exact Labeling with Prioritized Size 

In this section we exhibit a prioritized exact distance labeling scheme tailored for graphs that admit a small separator. We say that a graph $G = (V, E)$ admits an $s$-separator if for any weight function $w : V \to \mathbb{R}_+$, there exists a set $U \subseteq V$ of size $|U| = s$, such that each connected component $C$ of $G \setminus U$ has $w(C) \le 2w(V)/3$.$^9$ It is well known that trees admit a 1-separator, and graphs of treewidth $k$ admit a $k$-separator.

The basic idea for constructing an exact distance labeling scheme based on separators is to create a hierarchical partition of the graph, each time by applying the separator on each connected component. Then the label of a vertex $u$ consists of all distances to all the vertices in the separators of clusters that contain $u$. To answer a query between vertices $u, v$, we return the minimum of $d(u, s) + d(v, s)$ over all separator vertices $s$ that $u, v$ have in common in their labels (this is the exact distance, because at some point a vertex on the shortest path from $u$ to $v$ must be chosen to be in a separator). Since at every iteration the number of vertices in each cluster drops by at least a constant factor, after $O(\log n)$ levels the process is complete, thus the label size is at most $O(s \log n)$.
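For intuition, the basic (non-prioritized) scheme can be sketched for weighted trees, where a single centroid vertex serves as the separator. This is only an illustration of the separator idea under assumptions: the centroid splits each component into parts of at most half its size (a stronger balance than the 2/3 in the definition), the tree is given as an adjacency list, and the function names are hypothetical.

```python
def tree_distance_labels(adj):
    """Exact labels for a weighted tree: {vertex: {separator vertex: distance}}."""
    labels = {v: {} for v in adj}
    removed = set()

    def component(root):
        # Vertices reachable from root without crossing removed separators.
        seen, stack = {root}, [root]
        while stack:
            x = stack.pop()
            for y, _ in adj[x]:
                if y not in removed and y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen

    def centroid(comp):
        # A vertex whose removal leaves parts of at most half the size.
        start = next(iter(comp))
        order, parent = [start], {start: None}
        for x in order:                       # BFS; the list grows as we go
            for y, _ in adj[x]:
                if y in comp and y not in parent:
                    parent[y] = x
                    order.append(y)
        size = {x: 1 for x in comp}
        for x in reversed(order[1:]):
            size[parent[x]] += size[x]
        for x in order:
            parts = [len(comp) - size[x]] + [size[y] for y, _ in adj[x]
                                             if y in comp and parent.get(y) == x]
            if 2 * max(parts) <= len(comp):
                return x

    pending = [component(next(iter(adj)))]
    while pending:
        comp = pending.pop()
        sep = centroid(comp)
        dist, stack = {sep: 0}, [sep]         # distances inside the component
        while stack:
            x = stack.pop()
            for y, w in adj[x]:
                if y in comp and y not in dist:
                    dist[y] = dist[x] + w
                    stack.append(y)
        for x, dx in dist.items():
            labels[x][sep] = dx               # every vertex stores d(x, sep)
        removed.add(sep)
        for y, _ in adj[sep]:
            if y in comp:
                pending.append(component(y))
    return labels


def label_query(label_u, label_v):
    """Minimum of d(u, s) + d(v, s) over the separators common to both labels."""
    return min(label_u[s] + label_v[s] for s in label_u.keys() & label_v.keys())
```

Each vertex stores one entry per level of the decomposition, so a tree on $n$ vertices yields labels with at most $\lfloor \log n \rfloor + 1$ entries.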

Our improved label size for vertices of high priority will be based on the following observation: If the weight function $w$ is an indicator for a set $S \subseteq V$ (that is, if $u \in S$ then $w(u) = 1$, and if $u \in V \setminus S$ then $w(u) = 0$), then after $\lfloor \log |S| \rfloor + 1$ iterations, all vertices of $S$ must have been removed from the graph.


Theorem 10. Let $G = (V, E)$ be a graph admitting an $s$-separator, and let $V = (x_1, \dots, x_n)$ be a priority ranking of the vertices. Then there exists an exact distance labeling scheme with prioritized label size $O(s \cdot \log j)$.

Proof. Let $S_0 = \{x_1, x_2\}$, and for $1 \le i \le \lceil \log\log n \rceil$ let $S_i = \{x_j : 2^{2^{i-1}} < j \le 2^{2^i}\}$. The hierarchical partition will be performed in $\log\log n$ phases. The $i$-th phase consists of $2^i + 1$ levels. In each level of the $i$-th phase, we generate an $s$-separator for each remaining connected component $C$, with the following weight function

$$w(u) = \begin{cases} 1 & u \in S_i \cap C \\ 0 & \text{otherwise} \end{cases}$$

Then this separator is removed from the component. By the observation made above, after at most $1 + \log |S_i| \le 2^i + 1$ levels, all remaining components have no vertices from $S_i$. The label of a vertex $u \in V$ will be the distances to all points in the separators created for components containing $u$.

Fix some vertex $x_j$ (for $j > 1$), and assume $x_j \in S_i$. Notice that $2^{i-1} < \log j$. Then the label size of $x_j$ is at most

$$\sum_{k=0}^{i} s \cdot (2^k + 1) = O(s \cdot 2^i) = O(s \cdot \log j) .$$


6.3.2 Planar Graphs and Graphs Excluding a Fixed Minor 

While exact distance labeling for planar graphs requires polynomial label size or query time, there is a $1 + \epsilon$ stretch labeling scheme for planar graphs with label size $O(\log n)$ [Tho01, Kle02], which was extended to graphs excluding a fixed minor [AG06]. All these constructions are based on path separators: a constant number of shortest paths in the graph, whose removal induces pieces of bounded weight. The label of a vertex consists of distances to carefully selected vertices on these paths. We may use the same methodology as above; generate these path separators for the sets $S_i$ in order, and obtain the following.

9 For a set $C \subseteq V$, its weight is defined as $w(C) = \sum_{u \in C} w(u)$.

Theorem 11. Let $G = (V, E)$ be a graph excluding some fixed minor, and $V = (x_1, \dots, x_n)$ a priority ranking of the vertices. Then for any $\epsilon > 0$ there exists a distance labeling scheme with stretch $1 + \epsilon$ and prioritized label size $O((\log j)/\epsilon)$.


7 Routing 


7.1 Routing in Trees with Prioritized Labels 


In this section we extend a result of [TZ01b], and show a routing scheme on trees. The setting is that each vertex stores a routing table, and when a routing request arrives for vertex $v$, it contains $L(v)$, the label of vertex $v$. We will show the following.

Theorem 12. For any tree $T = (V, E)$ there is a routing scheme with routing tables of size $O(1)$ and labels of prioritized size $\log j + 2 \log\log j + 4$.


Proof. The proof follows closely the one of [TZ01b], with the major difference being the assignment of weights, which gives preference to the high priority vertices. This ensures that when routing from the root of the tree to a vertex of rank $j$, there are roughly $\log j$ junctions that require routing information from the label of the vertex.

Let $x_1, \dots, x_n$ be the priority ranking of $V$. Let $S_0 = \{x_1\}$ and for each $1 \le i \le \log n$, let $S_i = \{x_j : 2^{i-1} < j \le 2^i\}$. Fix an arbitrary root $r$ of the tree $T$. For every $v \in S_i$ define $p(v) = \frac{1}{2^i \cdot (i+1)^2}$. Note that as $|S_i| \le 2^i$ we have that

$$\sum_{v \in V} p(v) \le \sum_{i=0}^{\log n} \frac{2^i}{2^i \cdot (i+1)^2} \le 2 .$$


For each $v \in V$, define the weight of $v$ as $s_v = \sum_{u \in T_v} p(u)$, where $T_v$ is the subtree rooted at $v$ (including $v$ itself). A child $v'$ of $v$ is called heavy if its weight is greater than $s_v/2$; otherwise it is called light. The root $r$ of the tree will always be considered heavy. Observe that any vertex can have at most one heavy child. The light level $\ell(v)$ of a vertex $v$ is defined as the number of light vertices on the path from the root to $v$, denoted by $\mathrm{Path}(v) = (r = v_0, v_1, \dots, v_k = v)$. The label size of $v$ will be $\ell(v)$ words.

We enumerate all vertices of $T$ in DFS order, where all the light children of a vertex are visited before its heavy child is visited. (The order is otherwise arbitrary.) We identify each vertex $v$ with its DFS number. Let $f_v$ denote the largest descendant of $v$. Also, let $h_v$ denote its heavy child, if it exists. If it does not exist, define $h_v = f_v + 1$. Also, let $P(\pi(v))$ denote the port number of the edge connecting $v$ to its parent $\pi(v)$, and $P(h_v)$ denote the port number connecting $v$ to its heavy child (if it exists). The routing table stored at $v$ is $(v, f_v, h_v, P(\pi(v)), P(h_v))$. It requires $O(1)$ words.

Each time an edge from a vertex to one of its light children is taken, the weight of the corresponding subtree decreases by at least a factor of 2. Note that a vertex $v = x_j \in S_i$ has weight at least $s_v \ge p(v) = \frac{1}{2^i \cdot (i+1)^2}$, and since the root has weight at most 2, it follows that $\ell(v) \le \log(2 \cdot 2^i \cdot (i+1)^2) = i + 2\log(i+1) + 1$. Since $2^{i-1} < j$, we conclude that

$$\ell(v) \le \log j + 2\log(\log(j) + 2) + 2 .$$


For each index $q$, $1 \le q \le \ell(v)$, denote by $i_q$ the index of the $q$-th light vertex of $\mathrm{Path}(v)$. Let $L(v) = (v, (\mathrm{port}(v_{i_1 - 1}, v_{i_1}), \dots, \mathrm{port}(v_{i_{\ell(v)} - 1}, v_{i_{\ell(v)}})))$ be the label of $v$, which consists of its name, and a sequence of at most $\ell(v)$ words containing the port numbers corresponding to the edges leading to light children on $\mathrm{Path}(v)$.

The routing algorithm works as follows. Suppose we need to route a message with the header $L(v)$ at a vertex $w$. The vertex $w$ checks if $w = v$. If so, we are done. Otherwise, $w$ checks if $v \in [w, f_w]$. If not, then $v$ is not in the subtree of $w$, and $w$ sends the message to its parent. Otherwise $w$ checks if $v \in [h_w, f_w]$. If so, the message is sent to the heavy child. Otherwise $v$ is a descendant of a light child of $w$. The vertex $w$ finds itself in the sequence of $L(v)$, and determines to which light child of $w$ the message should be sent. Then it sends the message to this child.

□
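The whole scheme fits in a short Python sketch. Several details are assumptions made for the example rather than part of the proof: a port is represented by the DFS number of the neighbor it leads to, each routing table also stores the light level of its vertex (so that $w$ can locate its position in the port sequence of $L(v)$), the tree is given as a dictionary of child lists, and the construction is recursive, which is adequate only for shallow trees.

```python
import math


def build_routing(children, root, rank):
    """children: {v: [child, ...]}; rank: {v: j}, where j = 1 is the top priority.

    Returns (table, label): table[num] = (f, h, parent, light_level) keyed by
    DFS number, and label[v] = (DFS number of v, ports of light edges on Path(v)).
    """
    def p(v):
        i = (rank[v] - 1).bit_length()       # x_j is in S_i with i = ceil(log2 j)
        return 1.0 / (2 ** i * (i + 1) ** 2)

    order = [root]
    for v in order:                          # BFS order; the list grows as we go
        order.extend(children[v])
    weight = {}
    for v in reversed(order):                # subtree weights s_v
        weight[v] = p(v) + sum(weight[c] for c in children[v])

    table, label, counter = {}, {}, [0]

    def visit(v, parent_num, ports, is_light):
        num = counter[0]
        counter[0] += 1
        if is_light:
            ports = ports + (num,)           # port = DFS number of the light child
        label[v] = (num, ports)
        heavy = [c for c in children[v] if weight[c] > weight[v] / 2]
        for c in children[v]:                # light children are visited first
            if c not in heavy:
                visit(c, num, ports, True)
        h_num = None
        if heavy:
            h_num = counter[0]
            visit(heavy[0], num, ports, False)
        f_num = counter[0] - 1               # largest descendant of v
        table[num] = (f_num, h_num if heavy else f_num + 1, parent_num, len(ports))

    visit(root, None, (), False)
    return table, label


def next_hop(table, w, header):
    """One routing step at vertex w (a DFS number); None means arrival."""
    v, ports = header
    f_w, h_w, parent, light_level = table[w]
    if w == v:
        return None
    if not (w <= v <= f_w):
        return parent                        # target is outside the subtree of w
    if h_w <= v <= f_w:
        return h_w                           # target is below the heavy child
    return ports[light_level]                # next light edge on Path(v)
```

Because the heavy child is visited last, its subtree occupies exactly the interval $[h_w, f_w]$, which is what makes the constant-size table sufficient for the heavy case.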


7.2 Routing in General Graphs 

To obtain a routing scheme for general graphs, we use the same method as [TZ01b], but replace their distance labeling with our prioritized ones from Theorem 7. This routing scheme has the following property: after an initial calculation using the entire label of the destination vertex $v$, all routing decisions are based on a much shorter header appended to the message. In particular, we obtain the following theorem.

Theorem 13. For any graph $G = (V, E)$ with priority ranking $x_1, \dots, x_n$ of $V$, and any parameter $t \ge 1$, there exists a routing scheme, such that the label size of $x_j$ is at most $\log j \cdot \left\lceil \frac{t \log j}{\log n} \right\rceil \cdot (1 + o(1))$, its header is of size $\log j \cdot (1 + o(1))$, and it stores a routing table of size $O(n^{1/t} \cdot \log j)$. Routing from any vertex into $x_j$ will have stretch at most $4\left\lceil \frac{t \log j}{\log n} \right\rceil - 3$.

Sketch. We use the definitions of Section 5.2. Consider the distance labeling scheme given in Theorem 7. Following [TZ05], this labeling scheme yields a tree-cover: a collection of subtrees such that vertex $v = x_j$ belongs to at most $|B(v)|$ trees. The tree $T_z$ for vertex $z$ contains $z$ as the root, and the shortest path to all the vertices in $C(z) = \{x \in V : z \in B(x)\}$. To route from some vertex $u \in V$ to $v$, it suffices to find an appropriate $z \in B(u) \cap B(v)$, and route in $T_z$ by applying Theorem 12.

The routing table stored at each vertex $v \in V$ contains the hash table for its bunch $B(v)$, and the routing table needed to route in $T_z$ for each $z \in B(v)$. Recall that by Theorem 7, $|B(v)| \le O(n^{1/t} \cdot \log j)$ (where $v = x_j$), and by Theorem 12, the routing table of each tree is of constant size. Let $i$ be the maximal such that $v = x_j \in S_i$. The label of $v$ is $((p_i(v), L_i(v)), \dots, (p_{t-1}(v), L_{t-1}(v)))$, where $L_h(v)$ is the label of $v$ that is required to route in $T_{p_h(v)}$. Note that the label is of size $(t - i) \log j \cdot (1 + o(1)) = \log j \cdot \left\lceil \frac{t \log j}{\log n} \right\rceil \cdot (1 + o(1))$ (the equality follows from a calculation done in Section 5.2).

Finding the tree which guarantees the prioritized stretch as in Theorem 7 could have been achieved by
using Algorithm 2; alas, this requires knowledge of the bunches of both vertices $u$ and $v$. It remains to see
that using only the label of $v$ and the routing table at $u$, one can find a tree in the cover which has stretch at
most $4 \lceil t \log j / \log n \rceil - 3$ for $u, v$ (routing in the tree does not increase the stretch). To see this, let $i \le h \le t - 1$
be the minimal index such that $p_h(v) \in B(u)$. Following [TZ01b], we prove by induction that for each $i \le k \le h$
it holds that

• $d(v, p_k(v)) \le 2(k - i) \cdot d(u, v)$,

• $d(u, p_k(v)) \le (2(k - i) + 1) \cdot d(u, v)$.

The base case $k = i$ holds since $v = p_i(v)$. Assume the claim for $k$; for $k + 1$ it suffices to prove the first item, as
the second follows from the first by the triangle inequality. Since $k < h$ it follows that $p_k(v) \notin B(u)$, thus
it must be that $d(u, p_{k+1}(u)) \le d(u, p_k(v))$. Now,

$$d(v, p_{k+1}(v)) \le d(v, p_{k+1}(u)) \le d(v,u) + d(u, p_{k+1}(u)) \le d(v,u) + d(u, p_k(v)) \le (2(k-i)+2) \cdot d(u,v),$$

where the last inequality uses the induction hypothesis. Finally, routing through the shortest path tree rooted
at $p_h(v)$ will have stretch at most

$$d(u, p_h(v)) + d(p_h(v), v) \le (4(h-i)+1) \cdot d(u,v) \le (4(t-i)-3) \cdot d(u,v) = \left(4 \left\lceil \frac{t \log j}{\log n} \right\rceil - 3\right) \cdot d(u,v),$$


using that $h \le t - 1$ and that $t - i = \lceil t \log j / \log n \rceil$. This concludes the bound on the stretch. Note that once the
vertex $p_h(v)$ is found, all other vertices on the route from $u$ to $v$ only require the information $(p_h(v), L_h(v))$,
which is appended to the message as a header of size $\log j \cdot (1 + o(1))$.

□ 
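
The tree-selection step of the sketch can be illustrated as follows. This is a minimal sketch only, assuming the label of $v$ is given as a list of (pivot, tree label) pairs for the levels $i, \ldots, t-1$ and the bunch $B(u)$ as a Python set; the names `select_tree`, `label_v` and `bunch_u` are hypothetical and do not come from the paper.

```python
from typing import Hashable, List, Optional, Tuple

def select_tree(label_v: List[Tuple[Hashable, object]],
                bunch_u: set) -> Optional[Tuple[Hashable, object]]:
    """Return the first pair (p_h(v), L_h(v)) whose pivot lies in B(u).

    The pairs are scanned in increasing level order, so the result
    corresponds to the minimal h with p_h(v) in B(u).
    """
    for pivot, tree_label in label_v:
        if pivot in bunch_u:
            return pivot, tree_label
    return None  # no pivot of v lies in B(u); the proof assumes this does not happen

# Hypothetical toy data: pivots of v on three consecutive levels.
label_v = [("v", "L_i"), ("a", "L_i+1"), ("b", "L_i+2")]
bunch_u = {"u", "b", "c"}
header = select_tree(label_v, bunch_u)
```

Only the returned pair needs to travel with the message, which is why the header is much shorter than the full label.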


Corollary 4. Any graph $G = (V, E)$ with a priority ranking $x_1, \ldots, x_n$ has a fully prioritized routing
scheme, such that the label size of $x_j$ is at most $\log^2 j \cdot (1 + o(1))$, its header is of size $\log j \cdot (1 + o(1))$,
and it stores a routing table of size $O(\log j)$. Routing from any vertex into $x_j$ will have stretch at most
$4 \lceil \log j \rceil - 3$.
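
To get a feeling for these bounds, the following sketch evaluates the stretch bound $4 \lceil t \log j / \log n \rceil - 3$ of Theorem 13 with exact integer arithmetic. The function names are hypothetical, and clamping the number of levels to at least 1 (so that $j = 1$ gives stretch 1) is an assumption of this sketch, not a statement from the paper.

```python
import math

def num_levels(j: int, n: int, t: int) -> int:
    """Smallest integer m >= 1 with n^m >= j^t, i.e. max(1, ceil(t*log j / log n)).

    Uses big-integer comparisons to avoid floating-point rounding in the ceiling.
    Assumes n >= 2 and j >= 1.
    """
    m = 1
    while n ** m < j ** t:
        m += 1
    return m

def stretch_bound(j: int, n: int, t: int) -> int:
    """Stretch bound 4 * ceil(t log j / log n) - 3 for routing into x_j."""
    return 4 * num_levels(j, n, t) - 3

def label_size_leading_term(j: int, n: int, t: int) -> float:
    """log2(j) * ceil(t log j / log n), ignoring the (1 + o(1)) factor."""
    return math.log2(j) * num_levels(j, n, t)

n = 2 ** 20
high_priority = stretch_bound(32, n, t=4)       # rank 32, few levels
low_priority = stretch_bound(n, n, t=4)         # lowest rank, all t levels
fully_prioritized = stretch_bound(32, n, t=20)  # t = log n, as in Corollary 4
```

With $t = \log n$ the bound depends on $j$ alone, which is the setting of Corollary 4.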


8 Prioritized Embedding into Normed Spaces 

8.1 Embedding with Prioritized Distortion 

In this section we study embedding arbitrary metrics into normed spaces, where the distortion is prioritized 
according to the given ranking of the points in the metric. Our main result is the following 

Theorem 14. For any $p \in [1, \infty]$, $\epsilon > 0$, and any finite metric space $(X, d)$ with priority ranking $X =
(x_1, \ldots, x_n)$, there exists an embedding of $X$ into $\ell_p^{O(\log^2 n)}$ with priority distortion $O(\log j \cdot (\log\log j)^{(1+\epsilon)/2})$.

Proof overview. Our improved distortion guarantee for high ranked points comes from a variation of
Bourgain's embedding [Bou85] of finite metric spaces into $\ell_p$ space. Bourgain's embedding is based on
randomly sampling sets in various densities, and defining the coordinates as distances to these sets. Our first
observation (see Lemma 2) is that sampling points only from a subset $K \subseteq X$ suffices to obtain an embedding
which is non-expansive for all pairs, and has bounded contraction for pairs in $K \times X$. Furthermore, the
contraction depends only on $|K|$, rather than on $|X|$.

We then use a similar strategy as in previous sections, and partition $X$ into roughly $\log\log n$ subsets
$S_0, S_1, \ldots, S_{\log\log n}$, where $S_i$ is of size $\approx 2^{2^i}$. The doubly exponential size arises because for any $u, v \in S_i$,
the logarithms of the rankings of $u$ and of $v$ differ by at most a factor of 2. For each $i$, we create the embedding
$f_i$ that will "handle" pairs in $S_i \times X$, and concatenate all these functions: $f = \bigoplus_{i=0}^{\log\log n} \alpha_i \cdot f_i$. Without
the $\alpha_i$ factor, every pair will suffer a $(\log\log n)^{1/p}$ term in the distortion due to expansion. We introduce
these factors to the embedding, where $\alpha_i$ is such that $\sum_i \alpha_i^p \le 1$. In such a way, the function $f$ is
non-expansive, but we pay a small factor of $1/\alpha_i$ in the distortion for pairs in $S_i \times X$.
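
The partition and the scaling factors described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the weights use the form $c \cdot (i+1)^{-(1+\epsilon)/p}$ chosen in the proof of Theorem 14, and normalizing $c$ by the finite sum (rather than fixing a universal small constant) is an assumption made here for concreteness.

```python
def doubly_exponential_buckets(n: int) -> list:
    """Split ranks 1..n into S_0 = {1, 2} and S_i = {j : 2^(2^(i-1)) < j <= 2^(2^i)}."""
    buckets = [range(1, min(n, 2) + 1)]
    i = 1
    while 2 ** (2 ** (i - 1)) < n:
        lo = 2 ** (2 ** (i - 1))
        hi = min(n, 2 ** (2 ** i))
        buckets.append(range(lo + 1, hi + 1))
        i += 1
    return buckets

def weights(num_buckets: int, p: float, eps: float) -> list:
    """alpha_i = c * (i + 1)^(-(1 + eps) / p), with c set so that sum alpha_i^p = 1.

    Only finite p >= 1 is handled in this sketch.
    """
    total = sum((i + 1) ** (-(1 + eps)) for i in range(num_buckets))
    c = total ** (-1.0 / p)
    return [c * (i + 1) ** (-(1 + eps) / p) for i in range(num_buckets)]

buckets = doubly_exponential_buckets(100)
alphas = weights(len(buckets), p=2, eps=0.5)
```

For $n = 100$ this yields four buckets, and the weights decay only polynomially in the bucket index, which is what keeps the factor $1/\alpha_i$ small.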

Lemma 2. Let $(X, d)$ be a metric space of size $|X| = n$, $K \subseteq X$ a subset of size $|K| = k$, and a parameter
$p \in [1, \infty]$. Then there is a non-expansive embedding of $X$ into $\ell_p^{O(\log^2 k)}$ such that the contraction of any
pair in $K \times X$ is at most $O(\log k)$.

Proof. Let $m = O(\log^2 k)$, and let $f : K \to \ell_p^m$ be a non-expansive embedding with contraction $\alpha = O(\log k)$
on the pairs of $K \times K$, which exists due to [Bou85, LLR95]. Since $f$ is a Fréchet embedding, we claim
that there is an extension $\hat{f} : X \to \ell_p^m$ of $f$ (that is, $\hat{f}(v) = f(v)$ for all $v \in K$), such that $\hat{f}$ is also
non-expansive. To see this, note that $f$ is defined as $f(x) = m^{-1/p} \bigoplus_{i=1}^{m} d(x, A_i)$ for some sets $A_i \subseteq K$.
One can then simply define $\hat{f}(x) = m^{-1/p} \bigoplus_{i=1}^{m} d(x, A_i)$ for every $x \in X$, which is indeed non-expansive.

Let $h : X \to \mathbb{R}$ be defined by $h(x) = d(x, K)$. The embedding $F : X \to \ell_p^{m+1}$ is defined by the
concatenation of these maps, $F = \hat{f} \oplus h$. Since both of the maps $\hat{f}, h$ are non-expansive, it follows that for
any $x, y \in X$,

$$\|F(x) - F(y)\|_p^p \le \|\hat{f}(x) - \hat{f}(y)\|_p^p + |h(x) - h(y)|^p \le 2 \cdot d(x,y)^p,$$

hence $F$ has expansion at most $2^{1/p}$ for all pairs. Let $t \in K$ and $x \in X$, and let $k_x \in K$ be such that
$d(x, K) = d(x, k_x)$ (it could be that $k_x = x$). If it is the case that $d(x, t) \le 3\alpha \cdot d(x, k_x)$, then by the single
coordinate of $h$ we get sufficient contribution for this pair:

$$\|F(t) - F(x)\|_p \ge |h(t) - h(x)| = h(x) = d(x, k_x) \ge \frac{d(x,t)}{3\alpha}.$$

The other case is that $d(x, t) > 3\alpha \cdot d(x, k_x)$; here we will get the contribution from $\hat{f}$. First observe that by
the triangle inequality,

$$d(t, k_x) \ge d(t, x) - d(x, k_x) \ge d(t, x)(1 - 1/(3\alpha)) \ge 2d(t, x)/3. \qquad (6)$$

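By another application of the triangle inequality, using that $\hat{f}$ is non-expansive, and that $f$ has contraction $\alpha$
on $K$, we get the required bound on the contraction:

$$\begin{aligned}
\|F(t) - F(x)\|_p &\ge \|\hat{f}(t) - \hat{f}(x)\|_p \\
&\ge \|\hat{f}(t) - \hat{f}(k_x)\|_p - \|\hat{f}(k_x) - \hat{f}(x)\|_p \\
&\ge \|f(t) - f(k_x)\|_p - d(x, k_x) \\
&\ge \frac{d(t, k_x)}{\alpha} - \frac{d(t, x)}{3\alpha} \\
&\overset{(6)}{\ge} \frac{2d(t, x)}{3\alpha} - \frac{d(t, x)}{3\alpha} = \frac{d(t, x)}{3\alpha}.
\end{aligned}$$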

In particular, the function $2^{-1/p} \cdot F$ is non-expansive for all pairs, and has contraction at most $2^{1/p} \cdot 3 \cdot \alpha =
O(\log k)$ for pairs in $K \times X$. □
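
The construction in this proof can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact construction: the sets are sampled from $K$ with densities $2^{-i}$ and a fixed number of repetitions per density (`reps`, a hypothetical parameter), an empty sample is replaced by a single random point of $K$, and only finite $p$ is handled. The non-expansiveness checked in the usage lines holds for any choice of sets, since every coordinate is a 1-Lipschitz function.

```python
import math
import random

def bourgain_style_embedding(points, K, dist, p, reps=3, seed=0):
    """Map x to 2^(-1/p) * ( m^(-1/p) * (d(x, A_1), ..., d(x, A_m)), d(x, K) ).

    Every A_i is a random subset of K (assumed nonempty), so the first m
    coordinates play the role of the extended Frechet embedding and the last
    coordinate is h(x) = d(x, K).
    """
    rng = random.Random(seed)
    K = list(K)
    scales = max(1, math.ceil(math.log2(len(K))))
    sets = []
    for i in range(1, scales + 1):
        for _ in range(reps):
            sampled = [x for x in K if rng.random() < 2.0 ** -i]
            # Assumption of this sketch: replace an empty sample by one point of K.
            sets.append(sampled or [rng.choice(K)])
    m = len(sets)

    def d_set(x, S):
        return min(dist(x, s) for s in S)

    f_scale = (2.0 * m) ** (-1.0 / p)  # equals 2^(-1/p) * m^(-1/p)
    h_scale = 2.0 ** (-1.0 / p)
    return {x: [f_scale * d_set(x, S) for S in sets] + [h_scale * d_set(x, K)]
            for x in points}

def lp_distance(a, b, p):
    return sum(abs(s - t) ** p for s, t in zip(a, b)) ** (1.0 / p)

# Toy metric: points on the real line, with K a small subset.
points = [0, 1, 4, 9, 16, 25, 36]
K = [0, 4, 16]
line = lambda a, b: abs(a - b)
emb = bourgain_style_embedding(points, K, line, p=2)
```

The contraction guarantee for pairs in $K \times X$ is probabilistic and is not verified by this sketch.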


We are now ready to prove Theorem 14. 

Proof of Theorem 14. Let $S_0 = \{x_1, x_2\}$, and for $1 \le i \le \lceil \log\log n \rceil$ let $S_i = \{x_j : 2^{2^{i-1}} < j \le 2^{2^i}\}$.
For every $i$, let $f_i : X \to \ell_p$ be the embedding of Lemma 2 with $K = S_i$, and let $\alpha_i = c \cdot (i + 1)^{-(1+\epsilon)/p}$
for a sufficiently small constant $c$, so that $\sum_{i \ge 0} \alpha_i^p \le 1$. Finally, define the embedding $f : X \to \ell_p$ by

$$f = \bigoplus_{i=0}^{\lceil \log\log n \rceil} \alpha_i \cdot f_i.$$

To see that $f$ is indeed non-expansive, recall that each $f_i$ is non-expansive, so for any
$u, v \in X$,

$$\|f(u) - f(v)\|_p^p \le \sum_{i=0}^{\lceil \log\log n \rceil} \alpha_i^p \cdot \|f_i(u) - f_i(v)\|_p^p \le d(u,v)^p \sum_{i=0}^{\lceil \log\log n \rceil} \alpha_i^p \le d(u,v)^p.$$


For the contraction, let $v = x_j$ for some $j > 1$, and take any $u \in X$. Let $i$ be the index such that $v \in S_i$,
and note that $2^{i-1} < \log j$. By Lemma 2, the embedding $f_i$ has contraction at most $O(\log |S_i|) = O(2^i) =
O(\log j)$ for the pair $u, v$. Observe that $\alpha_i^p = c^p \cdot (i + 1)^{-(1+\epsilon)} = \Omega((2 + \log\log j)^{-(1+\epsilon)})$, thus

$$\|f(u) - f(v)\|_p^p \ge \alpha_i^p \cdot \|f_i(u) - f_i(v)\|_p^p \ge \Omega\left(\frac{d(u,v)^p}{(\log j)^p \cdot (2 + \log\log j)^{1+\epsilon}}\right).$$


It is not hard to verify that $x_1$ has constant contraction with any $u$, so the prioritized distortion is
$O(\log j \cdot (\log\log j)^{(1+\epsilon)/p})$. Finally, since the dimension of $f_i$ is $O(\log^2 |S_i|) = O(2^{2i})$, the embedding
$f$ maps $X$ into $\sum_{i=0}^{\lceil \log\log n \rceil} O(2^{2i}) = O(\log^2 n)$ dimensions. For $1 \le p \le 2$, one may embed
first into $\ell_2$, use [JL84] to reduce the dimension to $O(\log n)$, and then apply an embedding to $\ell_p^{O(\log n)}$,
while paying a constant factor in the distortion [FLM77]. The prioritized distortion will thus be at most
$O(\log j \cdot (\log\log j)^{(1+\epsilon)/2})$. □


8.2 Embedding with Prioritized Dimension 

The main result of this section is an embedding with prioritized distortion and dimension. This means that 
a high ranking point will have low distortion (with any other point), and additionally, its image will consist 
of few nonzero coordinates, followed by zeros in the rest. 

Theorem 15. For any $p \in [1, \infty]$, any fixed $\epsilon > 0$ and any metric space $(X, d)$ on $n$ points, there exists an
embedding of $X$ into $\ell_p^{O(\log^2 n)}$ with priority distortion $O(\log^{4+\epsilon} j)$, and prioritized dimension $O(\log^4 j)$.

Proof overview. The basic framework of this embedding appears at first glance to be similar to Section 8.1,
namely applying a variation of Bourgain's embedding while sampling only from certain subsets
$S_i$ of the points. However, the crux here is that we need to ensure that high priority points will be mapped
to the zero vector in the embeddings that "handle" the lower ranked points.

Recall that the coordinates of the embedding are given by distances to sets. The idea is the following:
while creating the embedding for the points in $S_i$, we insert all the points with higher ranking (those in
$S_0 \cup \cdots \cup S_{i-1}$) into every one of the randomly sampled sets. This will certify that the high ranked points
are mapped to zero in every one of these coordinates. However, the analysis of the distortion no longer
holds, as the sets are not randomly chosen. Fix some point $u \in S_i$ and $v \in X$. The crucial observation
is that if none of the higher ranked points lie in certain neighborhoods around $u$ and $v$ (the size of these
neighborhoods depends on $d(u, v)$), then we can still use the randomness of the selected sets to obtain some
bound (albeit not as good as the standard embedding achieves). If, on the other hand, there exists a high ranked point
nearby, say $c \in S_{i'}$ for some $i' < i$, then we argue that $u, v$ should already have sufficient contribution from
the embedding designed for $S_{i'}$. The formal derivation of this idea is captured in Lemma 3.

The calculation shows that the distortion guarantee for $u, v$ deteriorates by a logarithmic factor for each
$i$, that is, it is the product of the distortion bound for points in $S_{i-1}$ multiplied by $O(\log |S_i|)$. This implies
that the optimal size of $S_i$ is triple exponential in $i$, which yields the best balance between the price paid due
to the size of $S_i$ and the product of the logarithms of $|S_0|, \ldots, |S_{i-1}|$.

Lemma 3. Let $p \in [1, \infty]$ and $D \ge 1$. Given a metric space $(X, d)$, two disjoint subsets $A, K \subseteq X$ where
$|K| = k \ge 2$, and a non-expansive embedding $g : X \to \ell_p$ with contraction at most $D$ for all pairs in
$A \times X$, there is a non-expansive embedding $f : X \to \ell_p^{O(\log^2 k)}$ such that the following properties hold:

1. For all $x \in A$, $f(x) = 0$.

2. For all $(x, y) \in K \times X$, $\|f(x) - f(y)\|_p \ge \frac{d(x,y)}{1000 D \log k}$, or $\|g(x) - g(y)\|_p \ge \frac{d(x,y)}{2D}$.

We postpone the proof of Lemma 3 to Section 8.2.1, and prove Theorem 15 using the lemma. 

Proof of Theorem 15. Let $I = \lceil \log\log\log n \rceil$. Let $S_0 = \{x_1, x_2, x_3, x_4\}$, and for $1 \le i \le I$ let $S_i =
\{x_j : 2^{2^{2^{i-1}}} < j \le 2^{2^{2^i}}\}$. Also define $S_{<i} = \bigcup_{0 \le k < i} S_k$.

The desired embedding $F : X \to \ell_p$ will be created by iteratively applying Lemma 3, each time using
its output function $f$ as part of the input for the next iteration. Formally, for each $0 \le i \le I$ apply Lemma 3
with parameters $A = S_{<i}$, $K = S_i$, $g = F^{(i-1)}$ and $D = 2^{2^i + 5i^2}$, to obtain a map $f_i : X \to \ell_p$. The map
$F^{(i)} : X \to \ell_p$ is defined as follows: $F^{(-1)} \equiv 0$, and $F^{(i)} = \bigoplus_{k=0}^{i} \alpha_k \cdot f_k$, where $(\alpha_k)$ is a sequence that
ensures $F^{(i)}$ is non-expansive for all $i$. For concreteness, take $\alpha_i = \left(\frac{6}{\pi^2 (i+1)^2}\right)^{1/p}$. The final embedding is
defined by $F = F^{(I)}$.

Fix any pair $x, y \in X$. As $f_i$ is non-expansive by Lemma 3, we obtain that $F$ is non-expansive as well:

$$\|F(x) - F(y)\|_p^p = \sum_{i=0}^{I} \alpha_i^p \cdot \|f_i(x) - f_i(y)\|_p^p \le \frac{6}{\pi^2} \sum_{i=0}^{\infty} \frac{1}{(i+1)^2} \cdot d(x,y)^p = d(x,y)^p.$$

Next, we must show that for each $0 \le i \le I$, the embedding $F^{(i-1)}$ has contraction at most $2^{2^i + 5i^2}$ for pairs
in $S_{<i} \times X$, to comply with the requirement of Lemma 3. We prove this by induction on $i$. The base case
$i = 0$ holds trivially, as $F^{(-1)}$ has no requirement on its contraction (since $S_{<0} = \emptyset$). Assume (for $i$) that
$F^{(i-1)}$ has contraction at most $2^{2^i + 5i^2}$ on pairs in $S_{<i} \times X$. For $i + 1$, let $x \in S_{<i+1}$ and $y \in X$. Recall
that $F^{(i)}$ is generated by applying Lemma 3 with $A = S_{<i}$, $K = S_i$, $g = F^{(i-1)}$, and $D = 2^{2^i + 5i^2}$. Then
the lemma returns $f_i$, and finally $F^{(i)} = g \oplus (\alpha_i \cdot f_i)$.

We may assume that $x \in S_i$, otherwise $g = F^{(i-1)}$ has the required contraction on $x, y$ by the induction
hypothesis. Applying condition (2) of the lemma: if it is the case that $\|g(x) - g(y)\|_p \ge d(x, y)/(2D)$, then
clearly $2D \le 2^{2^{i+1} + 5(i+1)^2}$. The other case is that $\|f_i(x) - f_i(y)\|_p \ge \frac{d(x,y)}{1000 D \log |S_i|}$. Since $\log |S_i| \le 2^{2^i}$
and $1/\alpha_i \le 2(i + 1)^2$, the contraction of $F^{(i)}$ is at most the contraction of $\alpha_i \cdot f_i$, which is bounded by

$$\frac{1000 D \cdot \log |S_i|}{\alpha_i} \le 1000 \cdot 2^{2^i + 5i^2} \cdot 2^{2^i} \cdot 2(i+1)^2 \le 2^{2 \cdot 2^i + 5i^2 + 2\log(i+1) + 11} \le 2^{2^{i+1} + 5(i+1)^2}.$$

Observe that if $x = x_j \in S_i$ for some $j > 1$, then $2^{2^{i-1}} < \log j$, and thus the distortion of $F$ for any
pair containing $x$ is at most $2^{2^{i+1} + 5(i+1)^2} = O(\log^4 j) \cdot 2^{O((2 + \log\log\log j)^2)} = O(\log^{4+\epsilon} j)$. Additionally,
note that as the distortion of $F^{(I-1)}$ is at most $D = 2^{2^I + 5I^2}$, the same argument suggests that the maximal
distortion of $F = F^{(I)}$ for any pair is at most

$$\frac{1000 D \log n}{\alpha_I} \le 1000 \cdot 2^{2^I + 5I^2} \cdot \log n \cdot 2(I+1)^2 = O(\log^{3+\epsilon} n).$$

Finally, let us bound the number of nonzero coordinates of the points. Recall that $f_i$ maps $X$ into
$O(\log^2 |S_i|) \le O(2^{2^{i+1}})$ dimensions. Fix some $x = x_j$ for $j > 1$, and let $i$ be such that $x_j \in S_i$. Note
that $2^{2^{i-1}} < \log j$, so that $2^{2^{i+1}} < \log^4 j$. By Lemma 3, for every $i' > i$, $f_{i'}(x) = 0$, and the number of
coordinates used by $F^{(i)}$ is at most

$$\sum_{k=0}^{i} O(2^{2^{k+1}}) = O(2^{2^{i+1}}) = O(\log^4 j).$$

Since the dimension of $f_I$ is at most $O(\log^2 n)$, we get that the total number of coordinates used by $F$
is only

$$\sum_{k=0}^{I-1} O(2^{2^{k+1}}) + O(\log^2 n) \le O(2^{2^{1 + \log\log\log n}}) + O(\log^2 n) = O(\log^2 n).$$
□ 
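
The counting in the last part of the proof can be illustrated numerically. The sketch below locates the triple exponential bucket of a rank $j$ and sums the block sizes $2^{2^{k+1}}$ up to that bucket; the function names are hypothetical, and the constants hidden in the $O(\cdot)$ notation are taken to be 1 purely for illustration.

```python
def triple_exp_bucket(j: int) -> int:
    """Index i with x_j in S_i, where S_0 = {x_1, ..., x_4} and
    S_i = {x_j : 2^(2^(2^(i-1))) < j <= 2^(2^(2^i))}."""
    i = 0
    while j > 2 ** (2 ** (2 ** i)):
        i += 1
    return i

def prefix_blocks(i: int) -> int:
    """Sum over k <= i of 2^(2^(k+1)): the coordinates used by f_0, ..., f_i,
    with the constant in O(log^2 |S_k|) set to 1 (an assumption of this sketch)."""
    return sum(2 ** (2 ** (k + 1)) for k in range(i + 1))

ranks = (5, 17, 65537)
bucket_of = {j: triple_exp_bucket(j) for j in ranks}
coords_of = {j: prefix_blocks(bucket_of[j]) for j in ranks}
```

Because the block sizes grow doubly exponentially, the sum is dominated by its last term, which is at most $\log^4 j$ for a point of rank $j$.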


8.2.1 Proof of Lemma 3 


The basic approach to the proof is similar to that of Lemma 2: we sample subsets of $K$ according to various
densities. The main difference is that we insert all the points of $A$ into each sampled set, to ensure $f(x) = 0$
for all $x \in A$. The standard analysis of Bourgain for a pair $x, y$ considers certain neighborhoods defined
according to the density of points around $x, y$. We show that the analysis still works as long as no point of $A$
is present in those neighborhoods. Thus we can obtain a contribution which is proportional to the distance
of $x, y$ to $A$ (or to $d(x, y)$ if that distance is large). This motivates the following definition and lemma.

Definition 2. The $\gamma$-distance between $x$ and $y$ with respect to $A$ is defined to be

$$\gamma_A(x, y) = \min\left\{ \frac{d(x,y)}{2},\ d(x, A),\ d(y, A) \right\}.$$
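
As a small illustration, the $\gamma$-distance can be computed directly. The sketch reads the first term of the minimum as $d(x, y)/2$, which is what the disjointness of the balls in the proof of Lemma 4 requires; the function name, and the convention that $d(\cdot, A) = \infty$ for empty $A$, are assumptions of this sketch.

```python
def gamma_distance(x, y, A, dist):
    """gamma_A(x, y) = min(d(x, y) / 2, d(x, A), d(y, A)).

    d(., A) is treated as infinite when A is empty (an assumption here).
    """
    def d_to_A(u):
        return min((dist(u, a) for a in A), default=float("inf"))
    return min(dist(x, y) / 2, d_to_A(x), d_to_A(y))

line = lambda a, b: abs(a - b)
near_A = gamma_distance(0, 10, [12], line)   # y is within distance 2 of A
far_A = gamma_distance(0, 10, [100], line)   # A is far, so d(x, y) / 2 wins
```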


Lemma 4. Let $c = 24$. There exists a non-expansive embedding $\varphi : X \to \ell_p^{O(\log^2 k)}$ such that for all $z \in A$,
$\varphi(z) = 0$, and for all $x, y \in K$,

$$\|\varphi(x) - \varphi(y)\|_p \ge \frac{\gamma_A(x,y)}{c \log k}.$$


We defer the proof of Lemma 4, and proceed first with the proof of Lemma 3. Define $h : X \to \mathbb{R}$ for
$x \in X$ as $h(x) = d(x, A \cup K)$. Our embedding $f$ is

$$f = \frac{\varphi \oplus h}{2^{1/p}}.$$

Since both $\varphi$ and $h$ are non-expansive and vanish on $A$, clearly $f$ is non-expansive as well, and $f(z) = 0$
for any $z \in A$. It remains to show property (2) of the lemma. Fix any $x \in K$ and $y \in X$, and consider the
following three cases:


Case 1: $d(\{x, y\}, A) < \frac{d(x,y)}{4D}$.

In this case we shall use the guarantees of the map $g$. Assume w.l.o.g. that $z \in A$ is such that $d(y, z) <
\frac{d(x,y)}{4D}$. Then by the triangle inequality

$$d(x, z) \ge d(x, y) - d(y, z) \ge d(x, y) - \frac{d(x,y)}{4D} \ge \frac{3d(x,y)}{4}. \qquad (7)$$

Now, using that $g$ is non-expansive, and has contraction at most $D$ for any pair in $A \times X$, we obtain that

$$\begin{aligned}
\|g(x) - g(y)\|_p &\ge \|g(x) - g(z)\|_p - \|g(z) - g(y)\|_p \\
&\ge \frac{d(x,z)}{D} - d(z,y) \\
&\overset{(7)}{\ge} \frac{3d(x,y)}{4D} - \frac{d(x,y)}{4D} = \frac{d(x,y)}{2D},
\end{aligned}$$

which satisfies property (2).

Case 2: $d(\{x, y\}, A) \ge \frac{d(x,y)}{4D}$ and $d(y, K) \ge \frac{d(x,y)}{20cD \cdot \log k}$ (where $c = 24$ is the constant of Lemma 4).

Here we shall use the map $h$ for the contribution. Since $d(y, A) \ge d(x, y)/(4D)$, we have that $h(y) =
d(y, A \cup K) \ge \frac{d(x,y)}{20cD \cdot \log k}$, and of course $h(x) = 0$, so

$$\|f(x) - f(y)\|_p \ge \frac{|h(x) - h(y)|}{2^{1/p}} \ge \frac{d(x,y)}{40cD \cdot \log k},$$

as required.


Case 3: $d(\{x, y\}, A) \ge \frac{d(x,y)}{4D}$ and $d(y, K) < \frac{d(x,y)}{20cD \cdot \log k}$.

In this case, the function $\varphi$ will yield the required contribution, by employing a similar strategy to
Lemma 2. Let $k_y \in K$ be such that $d(y, k_y) = d(y, K)$. Note that $d(k_y, A) \ge d(y, A) - d(y, k_y) \ge
\frac{d(x,y)}{4D} - \frac{d(x,y)}{20cD \cdot \log k} \ge \frac{d(x,y)}{5D}$, and it follows that

$$\gamma_A(x, k_y) \ge \frac{d(x,y)}{5D}. \qquad (8)$$

By Lemma 4, since $f$ is non-expansive, and using another application of the triangle inequality, we conclude
that

$$\begin{aligned}
\|f(x) - f(y)\|_p &\ge \|f(x) - f(k_y)\|_p - \|f(y) - f(k_y)\|_p \\
&\ge \frac{\|\varphi(x) - \varphi(k_y)\|_p}{2} - d(y, k_y) \\
&\ge \frac{\gamma_A(x, k_y)}{2c \log k} - \frac{d(x,y)}{20cD \cdot \log k} \\
&\overset{(8)}{\ge} \frac{d(x,y)}{10cD \cdot \log k} - \frac{d(x,y)}{20cD \cdot \log k} = \frac{d(x,y)}{20cD \cdot \log k}.
\end{aligned}$$

This concludes the proof of Lemma 3. It remains to validate Lemma 4, which is similar in spirit to the
methods of [Bou85, LLR95]; we give full details for completeness.


Proof of Lemma 4. Let $I = \lceil \log k \rceil$ and $J = C \cdot \log k$ for a constant $C$ that will be determined later. For
each $i \in [I]$ and $j \in [J]$ sample a set $Q'_{ij}$ by including each $x \in K$ independently with probability $2^{-i}$,
and let $Q_{ij} = Q'_{ij} \cup A$. Define maps $\varphi_{ij} : X \to \mathbb{R}$ by letting, for each $u \in X$, $\varphi_{ij}(u) = d(u, Q_{ij})$, and
$\varphi : X \to \ell_p^{I \cdot J}$ by

$$\varphi(u) = \frac{1}{(I \cdot J)^{1/p}} \bigoplus_{i \in [I]} \bigoplus_{j \in [J]} \varphi_{ij}(u).$$

Since each $\varphi_{ij}$ is non-expansive, $\varphi$ is non-expansive as well, and in what follows we bound its contraction.
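
The definition of $\varphi$ translates almost directly into code. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the constant $C$ is set to an arbitrary small value rather than the "sufficiently large" one required by the analysis, $A$ is assumed nonempty so that every $Q_{ij}$ is nonempty, and only finite $p$ is handled.

```python
import math
import random

def zero_on_A_embedding(points, K, A, dist, p, C=4, seed=0):
    """phi(u) = (I*J)^(-1/p) * (d(u, Q_ij))_{i,j}, where Q_ij = Q'_ij union A and
    Q'_ij contains each point of K independently with probability 2^(-i)."""
    rng = random.Random(seed)
    K, A = list(K), list(A)
    if not A:
        raise ValueError("this sketch assumes A is nonempty")
    log_k = max(1.0, math.log2(len(K)))
    I = math.ceil(log_k)
    J = math.ceil(C * log_k)
    sets = []
    for i in range(1, I + 1):
        for _ in range(J):
            sampled = [x for x in K if rng.random() < 2.0 ** -i]  # Q'_ij
            sets.append(sampled + A)                               # Q_ij
    scale = (I * J) ** (-1.0 / p)
    return {u: [scale * min(dist(u, q) for q in Q) for Q in sets] for u in points}

# Toy metric on the line with disjoint K and A.
points = list(range(31))
K = [0, 5, 9, 14, 20, 27]
A = [2, 30]
line = lambda a, b: abs(a - b)
phi = zero_on_A_embedding(points, K, A, line, p=2)
```

Forcing $A$ into every sampled set is what makes all coordinates of the points of $A$ vanish; the price is that the sets are no longer purely random, which the rest of the proof accounts for.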

Define for $u \in K$ and $r \ge 0$ the ball restricted to $K$, $B_K(u, r) = B(u, r) \cap K$, and recall that by $B^\circ$
we mean the open ball. Fix a pair $u, v \in K$, and for each $0 \le i \le I$, let $r'_i$ be the minimal $r$ such that both
$|B_K(u, r)| \ge 2^i$ and $|B_K(v, r)| \ge 2^i$. Define $r_i = \min\{r'_i, \gamma_A(u, v)\}$ and let $\Delta_i = r_i - r_{i-1}$. Observe that
$r_0 = 0$ and $r_I = \gamma_A(u, v)$, so that

$$\sum_{i \in [I]} \Delta_i = \gamma_A(u, v). \qquad (9)$$

We first claim that for each $i \in [I]$ and $j \in [J]$,

$$\Pr[|\varphi_{ij}(u) - \varphi_{ij}(v)| \ge \Delta_i] \ge 1/12. \qquad (10)$$

If $\Delta_i = 0$ then there is nothing to prove. Assume then that $r_{i-1} < r_i$, and note that either $|B^\circ_K(u, r_i)| < 2^i$
or $|B^\circ_K(v, r_i)| < 2^i$ (otherwise it contradicts the minimality of $r'_i$). W.l.o.g. we have that $|B^\circ_K(u, r_i)| < 2^i$.
Furthermore, note that the sets $B^\circ_K(u, r_i)$, $B_K(v, r_{i-1})$ and $A$ are pairwise disjoint. Let $\mathcal{E}$ be the event that
$\{Q'_{ij} \cap B^\circ_K(u, r_i) = \emptyset\}$ and $\mathcal{F}$ be the event that $\{Q'_{ij} \cap B_K(v, r_{i-1}) \ne \emptyset\}$. Observe that if both events hold
then $d(u, Q_{ij}) \ge r_i$ and $d(v, Q_{ij}) \le r_{i-1}$, so that

$$|\varphi_{ij}(u) - \varphi_{ij}(v)| \ge r_i - r_{i-1} = \Delta_i.$$


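Since both balls are disjoint from $A$, we have that

$$\Pr[\mathcal{E}] = \prod_{x \in B^\circ_K(u, r_i)} \Pr[x \notin Q'_{ij}] = (1 - 2^{-i})^{|B^\circ_K(u, r_i)|} \ge (1 - 2^{-i})^{2^i} \ge \frac{1}{4}.$$

And similarly,

$$\Pr[\mathcal{F}] = 1 - \prod_{x \in B_K(v, r_{i-1})} \Pr[x \notin Q'_{ij}] = 1 - (1 - 2^{-i})^{|B_K(v, r_{i-1})|} \ge 1 - (1 - 2^{-i})^{2^{i-1}} \ge 1 - e^{-1/2} \ge \frac{1}{3}.$$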

Since the events $\mathcal{E}$ and $\mathcal{F}$ are independent, this concludes the proof of (10). Let $X_{ij}$ be an indicator random
variable for the event that $|\varphi_{ij}(u) - \varphi_{ij}(v)| \ge \Delta_i$, and let $X_i = \sum_{j \in [J]} X_{ij}$. Using the independence for
different values of $j$, and that $\mathbb{E}[X_i] \ge J/12$, a Chernoff bound yields that for any $i$

$$\Pr[X_i < J/24] \le e^{-J/100} \le 1/k^3,$$


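when $C$ is sufficiently large. Note that if indeed $X_i \ge J/24$ for all $1 \le i \le I$ then

$$\|\varphi(u) - \varphi(v)\|_p^p = \frac{1}{I \cdot J} \sum_{i=1}^{I} \sum_{j=1}^{J} |\varphi_{ij}(u) - \varphi_{ij}(v)|^p \ge \frac{1}{24 I} \sum_{i=1}^{I} \Delta_i^p \ge \frac{1}{24 I^p} \left( \sum_{i=1}^{I} \Delta_i \right)^p \overset{(9)}{=} \frac{\gamma_A(u,v)^p}{24 I^p},$$

where the second inequality uses Hölder's inequality. Applying a union bound over the $\binom{k}{2}$ possible pairs
in $\binom{K}{2}$, and the $I = \lceil \log k \rceil$ possible values of $i$, there is at least a constant probability that for every pair
$u, v \in K$,

$$\|\varphi(u) - \varphi(v)\|_p \ge \frac{\gamma_A(u,v)}{24 \log k}.$$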

□ 
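
The two constants behind the bound $1/12$ in (10) can be checked numerically. The sketch below evaluates the worst-case expressions $(1 - 2^{-i})^{2^i}$ and $1 - (1 - 2^{-i})^{2^{i-1}}$ for a range of scales; the function name and the chosen range of $i$ are illustrative assumptions (floating-point precision limits the check to moderate $i$).

```python
def event_probability_bounds(i: int):
    """Lower bounds used in the proof of (10), for scale i >= 1.

    pr_E: no point among fewer than 2^i candidates is sampled (worst case 2^i points).
    pr_F: at least one point among at least 2^(i-1) candidates is sampled.
    """
    q = 1.0 - 2.0 ** -i               # probability that a fixed point is not sampled
    pr_E = q ** (2 ** i)
    pr_F = 1.0 - q ** (2 ** (i - 1))
    return pr_E, pr_F

bounds = [event_probability_bounds(i) for i in range(1, 31)]
worst_product = min(e * f for e, f in bounds)
```

Both bounds are tight only at small scales: at $i = 1$ the first expression equals exactly $1/4$, while for large $i$ the two expressions tend to $1/e$ and $1 - e^{-1/2}$ respectively.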